This patch implements PROCFS_SET_GROUPS ioctl for /proc/self/status
files. The fd must be transferred with SCM_RIGHTS to another process
in order for this ioctl to work. Below the process that opened the
file, is referred to as "opener" and the process that calls this ioctl(),
is called "setter".
The following checks are performed about the opener:
- Opener opened his own status, not someone else's.
- Opener process still exist, so his proc entry is still valid.
- The groups weren't changed since opening the file.
- Opener was capable of CAP_SYS_ADMIN when opening the file.
The following checks are performed about the setter:
- Setter is a different process (thread) than an opener.
- PFA_NO_NEW_PRIVS not set.
- Either the process must be setgid-capable, or all the below must be
true:
- euid==uid==suid i.e. no possibility to switch uid
- egid==gid==sgid i.e. no possibility to switch gid
- euid/egid of current process matches uid/gid of an opener process
had at the time of opening the file.
The setter can read from an fd to find out what group list is there
to decide whether or not he wants to use it. The opener can't trick
that by changing the groups in between, as then ioctl() detects that
the group list was ever changed and fails with -EPERM.
This is also the reason why the opener is required to "exist" (and
not exit) for the duration of entire operation: otherwise his status
file may not be readable by setter.
Why /proc/self/status?
Because this file carries the group list and other credentials
already. Its quite natural to add an ioctl to apply whatever is
already there. Client is therefore able to read the needed information
and validate it in any way he wants, before applying. It is guaranteed
that the information ever read, would match the one applied, because
any change of the group info after the file was opened, disables the
ioctl.
What problem does this solve?
Currently there is no way to change the group list if the process
uses suid/sgid bits to switch to a less-privileged user to restrict
the access to the original user's files. So such restriction itself
requires CAP_SETGID to work. I.e. you need cap to drop the rights.
This ioctl allows to move CAP_SETGID from the main process to the
helper (server) process, keeping the main process cap-less.
This particular implementation also requires CAP_SYS_ADMIN on helper
side, but there may be a finer-grained approaches.
Usage scenario:
Main process connects to the helper via the AF_UNIX socket. Helper
uses SO_PEERCRED to retrieve client's uid/gid and then does initgroups()
with client's uid/gid to set the desired group list. Then it sets
uid/gid to match client's. It should keep his euid/egid untouched
or keep caps, as CAP_SYS_ADMIN check is performed by this ioctl.
Then it opens /proc/self/status and sends it to client with SCM_RIGHTS.
Client then reads from fd to validate the info in any way he wants,
and uses ioctl(PROCFS_SET_GROUPS) on that fd to actually set up the
received group list. It then replies to the server with the operation
status, or just closes the connection. At that point server is allowed
to exit.
Security considerations:
As explained above, the server process, when opening the status file,
must have uid/gid matching those of client (getting them via
SO_PEERCRED), or ioctl() fails. This may mean either of the below:
1. The server is setuid/setgid-capable. In this case we trust it does
the right thing, namely, calls initgroups() with client's creds.
2. The server was capable but dropped the caps after changing creds.
Same as above.
3. The server got the appropriate creds via login process. In this
case it can transfer only the proper group list, as he doesn't
have anything else and can't switch anything.
4. The server used suid/sgid bits to adjust the uid/gid. In this
case he doesn't have the right group list to share.
5. The server was spawned by someone else capable, like
`sudo setpriv --clear-groups --ruid=$RUID --rgid=$RGID ./server`
and has the appropriate uid/gid but wrong group list (or empty
group list, in case of --clear-groups).
Cases 4 and 5 have a malicious potential and are difficult to
distinguish from 2 and 3 on kernel level.
While it may be possible to come up with the more fine-grained
strategy, in this patch I took the simple approach: check the server
for CAP_SYS_ADMIN. In that case he definitely has the right to send
the needed groups to anyone. Server explicitly expresses his will to
do so by changing uid/gid to client's before opening the file. Sending
/proc/self/status via SCM_RIGHTS probably never happened "in a wild"
before, and for sure in any pre-existing practice CAP_SYS_ADMIN-capable
server won't change his uid/gid to client's before opening that file.
So I figured these steps are explicit enough to not add a new open()
flag or another ioctl() to allow the group info sharing in a yet more
explicit way. That check is not performed if the client is setgid-capable,
as in this case it could set the similar group list by other means
anyway.
It may be useful to inherit such fds via exec(), so O_CLOEXEC is
not required in this patch.
PFA_NO_NEW_PRIVS disables this ioctl.
Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
CC: Eric Biederman <ebiederm@xmission.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: Aleksa Sarai <cyphar@cyphar.com>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Christian Brauner <brauner@kernel.org>
CC: Jan Kara <jack@suse.cz>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Jeff Layton <jlayton@kernel.org>
CC: John Johansen <john.johansen@canonical.com>
CC: Chengming Zhou <chengming.zhou@linux.dev>
CC: Casey Schaufler <casey@schaufler-ca.com>
CC: Adrian Ratiu <adrian.ratiu@collabora.com>
CC: Felix Moessbauer <felix.moessbauer@siemens.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Oleg Nesterov <oleg@redhat.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
CC: linux-kernel@vger.kernel.org
CC: linux-fsdevel@vger.kernel.org
---
fs/proc/base.c | 137 +++++++++++++++++++++++++++++++++++++++-
include/linux/cred.h | 4 ++
include/uapi/linux/fs.h | 2 +
3 files changed, 141 insertions(+), 2 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 015db8752a99..67fae857372f 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -98,6 +98,7 @@
#include <linux/resctrl.h>
#include <linux/cn_proc.h>
#include <linux/ksm.h>
+#include <linux/cred.h>
#include <uapi/linux/lsm.h>
#include <trace/events/oom.h>
#include "internal.h"
@@ -829,6 +830,138 @@ static const struct file_operations proc_single_file_operations = {
};
+static int proc_status_open(struct inode *inode, struct file *filp)
+{
+ struct proc_inode *pi = PROC_I(inode);
+ struct pid *opener = get_task_pid(current, PIDTYPE_PID);
+
+ pi->op.proc_show = proc_pid_status;
+ return single_open(filp, proc_single_show, opener);
+}
+
+static int proc_status_release(struct inode *inode, struct file *file)
+{
+ struct seq_file *seq = file->private_data;
+ struct pid *opener = seq->private;
+
+ put_pid(opener);
+ return single_release(inode, file);
+}
+
+static bool can_borrow_groups(const struct cred *cur_cred,
+ const struct cred *f_cred)
+{
+ if (may_setgroups())
+ return 1;
+ /* Make sure the process can't switch uid/gid. */
+ if (!uid_eq(cur_cred->euid, cur_cred->uid) ||
+ !uid_eq(cur_cred->suid, cur_cred->uid))
+ return 0;
+ if (!gid_eq(cur_cred->egid, cur_cred->gid) ||
+ !gid_eq(cur_cred->sgid, cur_cred->gid))
+ return 0;
+ /* Make sure the euid/egid of current processes are equal
+ * to uid/gid of an opener at file open time.
+ */
+ if (!uid_eq(f_cred->uid, cur_cred->euid) ||
+ !gid_eq(f_cred->gid, cur_cred->egid))
+ return 0;
+ return 1;
+}
+
+static int do_proc_setgroups(const struct cred *task_cred,
+ const struct cred *cur_cred,
+ const struct cred *f_cred)
+{
+ struct group_info *cgi = get_group_info(cur_cred->group_info);
+ struct group_info *gi = get_group_info(task_cred->group_info);
+ int err;
+
+ /* Make sure groups didn't change since file open. */
+ err = -EPERM;
+ if (f_cred->group_info != gi)
+ goto out_gi;
+ /* Don't error if the process is setting the same list again. */
+ err = 0;
+ if (cgi == gi)
+ goto out_gi;
+
+ err = -EPERM;
+ if (!can_borrow_groups(cur_cred, f_cred))
+ goto out_gi;
+ err = set_current_groups(gi);
+
+out_gi:
+ put_group_info(gi);
+ put_group_info(cgi);
+ return err;
+}
+
+static int do_status_ioctl(struct task_struct *task, struct file *file,
+ unsigned int cmd, unsigned long arg)
+{
+ const struct cred *task_cred;
+ const struct cred *cur_cred;
+ int err = -EINVAL;
+
+ switch (cmd) {
+ case PROCFS_SET_GROUPS:
+ if (arg)
+ break;
+ /* Disallow opener process to set his own groups. */
+ err = -EPERM;
+ if (task == current)
+ break;
+ /* Don't change anything if current has NO_NEW_PRIVS. */
+ if (task_no_new_privs(current))
+ break;
+ /* Opener must be capable of granting his groups. */
+ if (!file_ns_capable(file, &init_user_ns, CAP_SYS_ADMIN))
+ break;
+ task_cred = get_task_cred(task);
+ cur_cred = get_current_cred();
+ err = do_proc_setgroups(task_cred, cur_cred, file->f_cred);
+ put_cred(cur_cred);
+ put_cred(task_cred);
+ break;
+ }
+ return err;
+}
+
+static long proc_status_ioctl(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ struct inode *inode = file_inode(file);
+ struct pid *pid = proc_pid(inode);
+ struct seq_file *seq = file->private_data;
+ struct pid *opener = seq->private;
+ struct task_struct *task;
+ long err;
+
+ /* Make sure opener opened his own proc entry. */
+ if (pid != opener)
+ return -EPERM;
+
+ task = get_pid_task(pid, PIDTYPE_PID);
+ if (!task)
+ return -ESRCH;
+
+ err = do_status_ioctl(task, file, cmd, arg);
+
+ put_task_struct(task);
+ return err;
+}
+
+static const struct file_operations proc_status_operations = {
+ .open = proc_status_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = proc_status_release,
+ .unlocked_ioctl = proc_status_ioctl,
+ .compat_ioctl = proc_status_ioctl,
+};
+
+
struct mm_struct *proc_mem_open(struct inode *inode, unsigned int mode)
{
struct task_struct *task = get_proc_task(inode);
@@ -3314,7 +3447,7 @@ static const struct pid_entry tgid_base_stuff[] = {
#endif
REG("environ", S_IRUSR, proc_environ_operations),
REG("auxv", S_IRUSR, proc_auxv_operations),
- ONE("status", S_IRUGO, proc_pid_status),
+ REG("status", S_IRUGO, proc_status_operations),
ONE("personality", S_IRUSR, proc_pid_personality),
ONE("limits", S_IRUGO, proc_pid_limits),
#ifdef CONFIG_SCHED_DEBUG
@@ -3665,7 +3798,7 @@ static const struct pid_entry tid_base_stuff[] = {
#endif
REG("environ", S_IRUSR, proc_environ_operations),
REG("auxv", S_IRUSR, proc_auxv_operations),
- ONE("status", S_IRUGO, proc_pid_status),
+ REG("status", S_IRUGO, proc_status_operations),
ONE("personality", S_IRUSR, proc_pid_personality),
ONE("limits", S_IRUGO, proc_pid_limits),
#ifdef CONFIG_SCHED_DEBUG
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 2976f534a7a3..cfdeebbd7db6 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -83,6 +83,10 @@ static inline int groups_search(const struct group_info *group_info, kgid_t grp)
{
return 1;
}
+static inline bool may_setgroups(void)
+{
+ return 1;
+}
#endif
/*
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 753971770733..d4b3fbfdac79 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -552,4 +552,6 @@ struct procmap_query {
__u64 build_id_addr; /* in */
};
+#define PROCFS_SET_GROUPS _IO(PROCFS_IOCTL_MAGIC, 18)
+
#endif /* _UAPI_LINUX_FS_H */
--
2.47.0
Hi Stas, kernel test robot noticed the following build errors: [auto build test ERROR on brauner-vfs/vfs.all] [also build test ERROR on linus/master v6.12-rc6 next-20241107] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Stas-Sergeev/procfs-avoid-some-usages-of-seq_file-private-data/20241108-060856 base: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all patch link: https://lore.kernel.org/r/20241107215821.1514623-3-stsp2%40yandex.ru patch subject: [PATCH 2/2] procfs: implement PROCFS_SET_GROUPS ioctl config: arc-randconfig-001-20241108 (https://download.01.org/0day-ci/archive/20241108/202411081749.pXnGtt8H-lkp@intel.com/config) compiler: arc-elf-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241108/202411081749.pXnGtt8H-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202411081749.pXnGtt8H-lkp@intel.com/ All errors (new ones prefixed by >>): fs/proc/base.c: In function 'do_proc_setgroups': >> fs/proc/base.c:891:15: error: implicit declaration of function 'set_current_groups'; did you mean 'get_current_groups'? [-Werror=implicit-function-declaration] 891 | err = set_current_groups(gi); | ^~~~~~~~~~~~~~~~~~ | get_current_groups cc1: some warnings being treated as errors Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for GET_FREE_REGION Depends on [n]: SPARSEMEM [=n] Selected by [y]: - RESOURCE_KUNIT_TEST [=y] && RUNTIME_TESTING_MENU [=y] && KUNIT [=y] vim +891 fs/proc/base.c 870 871 static int do_proc_setgroups(const struct cred *task_cred, 872 const struct cred *cur_cred, 873 const struct cred *f_cred) 874 { 875 struct group_info *cgi = get_group_info(cur_cred->group_info); 876 struct group_info *gi = get_group_info(task_cred->group_info); 877 int err; 878 879 /* Make sure groups didn't change since file open. */ 880 err = -EPERM; 881 if (f_cred->group_info != gi) 882 goto out_gi; 883 /* Don't error if the process is setting the same list again. */ 884 err = 0; 885 if (cgi == gi) 886 goto out_gi; 887 888 err = -EPERM; 889 if (!can_borrow_groups(cur_cred, f_cred)) 890 goto out_gi; > 891 err = set_current_groups(gi); 892 893 out_gi: 894 put_group_info(gi); 895 put_group_info(cgi); 896 return err; 897 } 898 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Hi Stas, kernel test robot noticed the following build errors: [auto build test ERROR on brauner-vfs/vfs.all] [also build test ERROR on linus/master v6.12-rc6 next-20241108] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Stas-Sergeev/procfs-avoid-some-usages-of-seq_file-private-data/20241108-060856 base: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all patch link: https://lore.kernel.org/r/20241107215821.1514623-3-stsp2%40yandex.ru patch subject: [PATCH 2/2] procfs: implement PROCFS_SET_GROUPS ioctl config: x86_64-buildonly-randconfig-004-20241108 (https://download.01.org/0day-ci/archive/20241108/202411081700.kAQXSC2u-lkp@intel.com/config) compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241108/202411081700.kAQXSC2u-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202411081700.kAQXSC2u-lkp@intel.com/ All errors (new ones prefixed by >>): In file included from fs/proc/base.c:66: In file included from include/linux/mm.h:2213: include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion] 518 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~~~~~~~~~ ^ ~~~ >> fs/proc/base.c:891:8: error: call to undeclared function 'set_current_groups'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 891 | err = set_current_groups(gi); | ^ 1 warning and 1 error generated. vim +/set_current_groups +891 fs/proc/base.c 870 871 static int do_proc_setgroups(const struct cred *task_cred, 872 const struct cred *cur_cred, 873 const struct cred *f_cred) 874 { 875 struct group_info *cgi = get_group_info(cur_cred->group_info); 876 struct group_info *gi = get_group_info(task_cred->group_info); 877 int err; 878 879 /* Make sure groups didn't change since file open. */ 880 err = -EPERM; 881 if (f_cred->group_info != gi) 882 goto out_gi; 883 /* Don't error if the process is setting the same list again. */ 884 err = 0; 885 if (cgi == gi) 886 goto out_gi; 887 888 err = -EPERM; 889 if (!can_borrow_groups(cur_cred, f_cred)) 890 goto out_gi; > 891 err = set_current_groups(gi); 892 893 out_gi: 894 put_group_info(gi); 895 put_group_info(cgi); 896 return err; 897 } 898 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
© 2016 - 2024 Red Hat, Inc.