From nobody Fri Apr 17 00:23:15 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BF11C3C2785; Mon, 13 Apr 2026 11:22:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079350; cv=none; b=aF1YVd+D3G1RjzbQKRudZaPhsI60QIUxpZ0wcyRtBjXM6eYM3ReTlX4jE6tpzhZ6T0Ebn4jHhQbLK+P2IGvr2vLqb84xw9QHAGAC6xwqcUnHRQVrjuYkLIvr5jhRdGxicrxqDsCPReNuYGYfQTsB9XZ24Rqecijou/KoQWHO7So= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079350; c=relaxed/simple; bh=+OQ5CG/U3jB5oYEUYPwcxgV6vHr16i5/CVGIz88bW0I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pulN1dVsbVe9t+pzob0KGJupisiXpQPFhwEPlSV3MU5NymPeThKaL4WqwYp7UtvLTRC4+nQ7VHdbjPLpv9twqnF7RlVkSiktpHFgxQevx4LnE3rktu6V34iqpBq7Q20/Tzn2Zs6z7wXjBUq3GxUsp4tBf+4Vqhfjj/6Km5+2EAY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=fMnnVUVd; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="fMnnVUVd" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A3AFDC19421; Mon, 13 Apr 2026 11:22:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1776079350; bh=+OQ5CG/U3jB5oYEUYPwcxgV6vHr16i5/CVGIz88bW0I=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fMnnVUVdby4vG1pbUjY1TCUZh5/woaFbZWiJn/w0ni7QQ+CfFex3wDZr/qg0uimor 5t4v4wHJJJPiMBkyIpJ/zQfc6DcDXSnwEenNwqZe58iLUrT5MI+Ts1vN2XkvoUxI/f 8wCieQtzfOaa96hMTGKSbaEEdNNHUXV/G2Ps3Uey2ji3DEiXwjFtFDRyuRGxaf4eAe xRfBBx9zh2l1rUCCKSyJ/EtN7kdb1plymlnFftaqa1UkrQxxf7aPRLR3Q5ei/Z1Upy 
DqdZEd9iS3k3wPBpP27YwgRv7hUPQBT78s2/mSOfqed91jvnWUcxFrSxwKQ6ufNfzK k4awtLHvzB3SQ== From: Alexey Gladkov To: Christian Brauner , Dan Klishch Cc: Al Viro , "Eric W . Biederman" , Kees Cook , containers@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 1/5] namespace: record fully visible mounts in list Date: Mon, 13 Apr 2026 13:19:40 +0200 Message-ID: <684859a8e0ac929cb89c1fbe16ce15b30c70eb1f.1776079055.git.legion@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Christian Brauner Instead of wading through all the mounts in the mount namespace rbtree to find fully visible procfs and sysfs mounts, be honest about them being special cruft and record them in a separate per-mount namespace list. Signed-off-by: Christian Brauner --- fs/mount.h | 4 ++++ fs/namespace.c | 19 +++++++++++-------- 2 files changed, 15 insertions(+), 8 deletions(-) diff --git a/fs/mount.h b/fs/mount.h index e0816c11a198..5df134d56d47 100644 --- a/fs/mount.h +++ b/fs/mount.h @@ -25,6 +25,7 @@ struct mnt_namespace { __u32 n_fsnotify_mask; struct fsnotify_mark_connector __rcu *n_fsnotify_marks; #endif + struct hlist_head mnt_visible_mounts; /* SB_I_USERNS_VISIBLE mounts */ unsigned int nr_mounts; /* # of mounts in the namespace */ unsigned int pending_mounts; refcount_t passive; /* number references not pinning @mounts */ @@ -90,6 +91,7 @@ struct mount { int mnt_expiry_mark; /* true if marked for expiry */ struct hlist_head mnt_pins; struct hlist_head mnt_stuck_children; + struct hlist_node mnt_ns_visible; /* link in ns->mnt_visible_mounts */ struct mount *overmount; /* mounted on ->mnt_root */ } __randomize_layout; =20 @@ -207,6 +209,8 @@ static inline void move_from_ns(struct mount *mnt) ns->mnt_first_node =3D 
rb_next(&mnt->mnt_node); rb_erase(&mnt->mnt_node, &ns->mounts); RB_CLEAR_NODE(&mnt->mnt_node); + if (!hlist_unhashed(&mnt->mnt_ns_visible)) + hlist_del_init(&mnt->mnt_ns_visible); } =20 bool has_locked_children(struct mount *mnt, struct dentry *dentry); diff --git a/fs/namespace.c b/fs/namespace.c index 854f4fc66469..539b74403072 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -321,6 +321,7 @@ static struct mount *alloc_vfsmnt(const char *name) INIT_HLIST_NODE(&mnt->mnt_slave); INIT_HLIST_NODE(&mnt->mnt_mp_list); INIT_HLIST_HEAD(&mnt->mnt_stuck_children); + INIT_HLIST_NODE(&mnt->mnt_ns_visible); RB_CLEAR_NODE(&mnt->mnt_node); mnt->mnt.mnt_idmap =3D &nop_mnt_idmap; } @@ -1098,6 +1099,10 @@ static void mnt_add_to_ns(struct mnt_namespace *ns, = struct mount *mnt) rb_link_node(&mnt->mnt_node, parent, link); rb_insert_color(&mnt->mnt_node, &ns->mounts); =20 + if ((mnt->mnt.mnt_sb->s_iflags & SB_I_USERNS_VISIBLE) && + mnt->mnt.mnt_root =3D=3D mnt->mnt.mnt_sb->s_root) + hlist_add_head(&mnt->mnt_ns_visible, &ns->mnt_visible_mounts); + mnt_notify_add(mnt); } =20 @@ -6310,22 +6315,20 @@ static bool mnt_already_visible(struct mnt_namespac= e *ns, int *new_mnt_flags) { int new_flags =3D *new_mnt_flags; - struct mount *mnt, *n; + struct mount *mnt; + + /* Don't acquire namespace semaphore without a good reason. */ + if (hlist_empty(&ns->mnt_visible_mounts)) + return false; =20 guard(namespace_shared)(); - rbtree_postorder_for_each_entry_safe(mnt, n, &ns->mounts, mnt_node) { + hlist_for_each_entry(mnt, &ns->mnt_visible_mounts, mnt_ns_visible) { struct mount *child; int mnt_flags; =20 if (mnt->mnt.mnt_sb->s_type !=3D sb->s_type) continue; =20 - /* This mount is not fully visible if it's root directory - * is not the root directory of the filesystem. 
- */ - if (mnt->mnt.mnt_root !=3D mnt->mnt.mnt_sb->s_root) - continue; - /* A local view of the mount flags */ mnt_flags =3D mnt->mnt.mnt_flags; =20 --=20 2.53.0 From nobody Fri Apr 17 00:23:15 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1746D3AF667; Mon, 13 Apr 2026 11:22:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079353; cv=none; b=IVaSoVu403HObIDtz9g4ohgK39/GNSClVad5RPXCcPjXeIRCRua3I73Ss5xKOIK0qeOjLUuqFxSTptd57iWWrcAgx7QTUVqbeAGlQZKMGchk+DjWTQWvfXBKEdAreBgr2qATcvRxYQGaC3zUqEnFeCgxXBoCyY5d8UQi3yRlxtQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079353; c=relaxed/simple; bh=FQCr7qSm4xiBoSTBWH1BWq5L0Ay30j0R812NdJ+wUOo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=s95zweFacok+PRi1tGRqcFL5QMqbkZjzeibq0NE0bU2bpTER5PjTScwtL9VZEAui+4jARAya6lqC8iGNdwLjkzCDKjslXrBoSzXaQgdjuaU1moyy3RcPitBr/9jG+dP3sSPSK7Lm65eMNAGZrxhBQNMVxx2SRT+7KF5Bxozs7LY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=LUlf+U5a; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="LUlf+U5a" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E68BFC19421; Mon, 13 Apr 2026 11:22:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1776079352; bh=FQCr7qSm4xiBoSTBWH1BWq5L0Ay30j0R812NdJ+wUOo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LUlf+U5aaLSmGeHO6JW2lnpMimJXxFosndPsDyVSddbVMfaEMCtX5PmW2NwFmveiM 
EOKLr72aSD30kD99QMxV2JT9bMZ2uW6ppBOlXPuSJWJCSO1FmX5ScbhMhHSbolaEM9 rOVmuMTKWGaLjUkFhN+hPlIk9745WDu2RTPoItxxj7ySY7DJWHMRQfsf6bcPtZ9Q7d KaCwP9sE3fkWOPq4pfJtbfBBPWJnkVvw3xQDjUObK6BpQnG3bmdOOmM9+dKOL8zQo5 oWeBbCTIsf7YAghYOhuVVigN7VSs0ruAfJyNQSeTJn5KdAFkNYcCgBiGYqwTsYJRoP p9a/32MzcsndQ== From: Alexey Gladkov To: Christian Brauner , Dan Klishch Cc: Al Viro , "Eric W . Biederman" , Kees Cook , containers@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Date: Mon, 13 Apr 2026 13:19:41 +0200 Message-ID: X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Cache the mounter's credentials and make access to the net directories contingent on the permissions of the mounter of proc. Do not show /proc/self/net when proc is mounted with the subset=pid option and the mounter does not have CAP_NET_ADMIN. To avoid inadvertently allowing access to /proc//net, updating the mounter's credentials is not supported.
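For illustration, the rule above can be sketched as a small user-space model. This is not kernel code: `model_fs_info`, `model_net_visible` and the `mounter_has_net_admin` flag are hypothetical names, and the flag merely stands in for the capability check made against the cached mounter credentials.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-superblock procfs state (not the kernel struct). */
struct model_fs_info {
	bool subset_pid;            /* procfs was mounted with subset=pid */
	bool mounter_has_net_admin; /* stand-in for the CAP_NET_ADMIN check on the mounter */
};

/* Returns true if the net directory would be shown on this procfs instance. */
bool model_net_visible(const struct model_fs_info *fs_info)
{
	assert(fs_info != NULL);

	/* Only a subset=pid mount made by an unprivileged mounter hides net. */
	if (fs_info->subset_pid && !fs_info->mounter_has_net_admin)
		return false;

	return true;
}
```

In this model a regular procfs mount is unaffected. Only the combination of subset=pid and a mounter without CAP_NET_ADMIN hides the net directory.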
Signed-off-by: Alexey Gladkov --- fs/proc/proc_net.c | 8 ++++++++ fs/proc/root.c | 2 ++ include/linux/proc_fs.h | 1 + 3 files changed, 11 insertions(+) diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c index 52f0b75cbce2..6e0ccef0169f 100644 --- a/fs/proc/proc_net.c +++ b/fs/proc/proc_net.c @@ -23,6 +23,7 @@ #include #include #include +#include =20 #include "internal.h" =20 @@ -270,6 +271,7 @@ static struct net *get_proc_task_net(struct inode *dir) struct task_struct *task; struct nsproxy *ns; struct net *net =3D NULL; + struct proc_fs_info *fs_info =3D proc_sb_info(dir->i_sb); =20 rcu_read_lock(); task =3D pid_task(proc_pid(dir), PIDTYPE_PID); @@ -282,6 +284,12 @@ static struct net *get_proc_task_net(struct inode *dir) } rcu_read_unlock(); =20 + if (net && (fs_info->pidonly =3D=3D PROC_PIDONLY_ON) && + security_capable(fs_info->mounter_cred, net->user_ns, CAP_NET_ADMIN, = CAP_OPT_NONE) < 0) { + put_net(net); + net =3D NULL; + } + return net; } =20 diff --git a/fs/proc/root.c b/fs/proc/root.c index 0f9100559471..6d18f9ee0375 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -254,6 +254,7 @@ static int proc_fill_super(struct super_block *s, struc= t fs_context *fc) return -ENOMEM; =20 fs_info->pid_ns =3D get_pid_ns(ctx->pid_ns); + fs_info->mounter_cred =3D get_cred(fc->cred); proc_apply_options(fs_info, fc, current_user_ns()); =20 /* User space would break if executables or devices appear on proc */ @@ -350,6 +351,7 @@ static void proc_kill_sb(struct super_block *sb) kill_anon_super(sb); if (fs_info) { put_pid_ns(fs_info->pid_ns); + put_cred(fs_info->mounter_cred); kfree_rcu(fs_info, rcu); } } diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h index 19d1c5e5f335..ec123c277d49 100644 --- a/include/linux/proc_fs.h +++ b/include/linux/proc_fs.h @@ -67,6 +67,7 @@ enum proc_pidonly { struct proc_fs_info { struct pid_namespace *pid_ns; kgid_t pid_gid; + const struct cred *mounter_cred; enum proc_hidepid hide_pid; enum proc_pidonly pidonly; struct rcu_head 
rcu; --=20 2.53.0 From nobody Fri Apr 17 00:23:15 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6BC893C3C15; Mon, 13 Apr 2026 11:22:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079355; cv=none; b=kfzft3/O7V1+ADa0QAFm4JB0/wiS599/ftoTcfTpoUYkD3farJHmkr2EmBSGGb5VHbxbTMwxNtjWyg9FUyOR0JAJWBCQHwsl93yhphMtMcVkHp6OvNKfdMleCPik/8sXLTQo1b7nATCmSB0l49rQwre6FkZMd5grh5EXmQm6J/4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079355; c=relaxed/simple; bh=L0acH809vek233P6WxGrNM6Yv7a2Wb2NzaBAvKvNzIc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kCIjmrHcncua81zKSzSjbZ4c1dkOqfXitwsXUN4nwEa2uGG+dLxSUmLZ6xGYGBpcNqZE9cYgJecgjfJe+abTUkADJTgaGJgGe8LoiPbO2I6+WjL3W8FeBBeyctU2cjTvGKztNhZCcjLvZc3l3i0q4Q17zYc8dfZbCEy0DOYU7Gk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ktv5OFQH; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ktv5OFQH" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 34EDDC2BCB1; Mon, 13 Apr 2026 11:22:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1776079355; bh=L0acH809vek233P6WxGrNM6Yv7a2Wb2NzaBAvKvNzIc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ktv5OFQHgLIvOic/9YObLbszJ7eX1bJWEVVfDeNUDQoAzyKLR3qWd1cSVhd8TWEnA E7p6f3ombm8oXl7EzVZJXQyL/E/OKFYZdTNl/P0tkn32b5IpSvj3g5n0xR4T52P93y BPmNpuCT86dVEf2apY7fG3tmrDBF0M8F4SK8o55HQOeOwadNlPEe30BYVUd+NqW2qf 
FQ56DsbrCqHB6+EGFFYTc465n2zStxSKX30ZxllnZfvqQVPN144P9CkZW7tIO6YBMq SfVrfjYCemlYV9FUHLh0z67+Ny1ZyuTtrpKEyGm1ObQav/Ayp+BSE1INT1gK4T0Uy9 AuGcI2NXQX9MQ== From: Alexey Gladkov To: Christian Brauner , Dan Klishch Cc: Al Viro , "Eric W . Biederman" , Kees Cook , containers@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 3/5] proc: Disable cancellation of subset=pid option Date: Mon, 13 Apr 2026 13:19:42 +0200 Message-ID: X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

When procfs is mounted with the subset=pid option, there is no way to remount it with this option removed. This is done in order not to make visible whatever was hidden, since some checks occur during mount. This patch makes the limitation explicit and prints an error message.
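The remount rule can be sketched as a small user-space model, for illustration only. The names `model_pidonly` and `model_apply_subset` are hypothetical. The only behaviour taken from the patch is that a request to clear subset=pid on an instance that already has it is rejected with -EINVAL, while every other request is applied.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the subset= state of a procfs instance. */
enum model_pidonly { MODEL_PIDONLY_OFF = 0, MODEL_PIDONLY_ON = 1 };

/*
 * Apply a requested subset= value to the current state.
 * Returns 0 on success, or -EINVAL if the request would clear subset=pid.
 */
int model_apply_subset(enum model_pidonly *current_val, enum model_pidonly requested)
{
	assert(current_val != NULL);

	/* Once subset=pid is set, it can only be requested again, never removed. */
	if (requested != MODEL_PIDONLY_ON && *current_val == MODEL_PIDONLY_ON)
		return -EINVAL;

	*current_val = requested;
	return 0;
}
```

The state is left untouched on failure, so a rejected remount does not change what the instance exposes.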
Signed-off-by: Alexey Gladkov --- fs/proc/root.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/fs/proc/root.c b/fs/proc/root.c index 6d18f9ee0375..05558654df31 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -223,7 +223,7 @@ static int proc_parse_param(struct fs_context *fc, stru= ct fs_parameter *param) return 0; } =20 -static void proc_apply_options(struct proc_fs_info *fs_info, +static int proc_apply_options(struct proc_fs_info *fs_info, struct fs_context *fc, struct user_namespace *user_ns) { @@ -233,13 +233,17 @@ static void proc_apply_options(struct proc_fs_info *f= s_info, fs_info->pid_gid =3D make_kgid(user_ns, ctx->gid); if (ctx->mask & (1 << Opt_hidepid)) fs_info->hide_pid =3D ctx->hidepid; - if (ctx->mask & (1 << Opt_subset)) + if (ctx->mask & (1 << Opt_subset)) { + if (ctx->pidonly !=3D PROC_PIDONLY_ON && fs_info->pidonly =3D=3D PROC_PI= DONLY_ON) + return invalf(fc, "proc: subset=3Dpid cannot be unset\n"); fs_info->pidonly =3D ctx->pidonly; + } if (ctx->mask & (1 << Opt_pidns) && !WARN_ON_ONCE(fc->purpose =3D=3D FS_CONTEXT_FOR_RECONFIGURE)) { put_pid_ns(fs_info->pid_ns); fs_info->pid_ns =3D get_pid_ns(ctx->pid_ns); } + return 0; } =20 static int proc_fill_super(struct super_block *s, struct fs_context *fc) @@ -255,7 +259,9 @@ static int proc_fill_super(struct super_block *s, struc= t fs_context *fc) =20 fs_info->pid_ns =3D get_pid_ns(ctx->pid_ns); fs_info->mounter_cred =3D get_cred(fc->cred); - proc_apply_options(fs_info, fc, current_user_ns()); + ret =3D proc_apply_options(fs_info, fc, current_user_ns()); + if (ret) + return ret; =20 /* User space would break if executables or devices appear on proc */ s->s_iflags |=3D SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV; @@ -304,8 +310,7 @@ static int proc_reconfigure(struct fs_context *fc) =20 sync_filesystem(sb); =20 - proc_apply_options(fs_info, fc, current_user_ns()); - return 0; + return proc_apply_options(fs_info, fc, current_user_ns()); } =20 static int 
proc_get_tree(struct fs_context *fc) --=20 2.53.0 From nobody Fri Apr 17 00:23:15 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9D9443C456F; Mon, 13 Apr 2026 11:22:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079357; cv=none; b=MhnCwWFCgP84S5jKdRV66f3WqmM863zOwvS0QEPEIg/j6nMgnutTXdw7+crvfUjZbqhsJ3KjVPtFjNRbgNtnbHz9tRhq2H6JWhCjIft4OJbJrmpLWSSLvI63DlNPd/7vr8wz359nWiX/R4jBPbb/6BqPiwrB72ppVpkT1YC5PWY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079357; c=relaxed/simple; bh=5hgEljYltABvDGV+0FITDfNO2kQKx19vmq3Iy9sLHxo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=afg8uJlYNLI4c/nKgwx+CZSD/gNblcIKOK5JglYYSrFG5liYrMl/ISxTMYSaLoaONbVb2gLDecA4FRGV8X50qqVCe8bozhHw8dYqcICt77e2txvvc3YtD8W93++FktMwV4ZXX6HlL7kc3Oo8VpCFPEGuzeluJVTuMdiHG8NnT18= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SZNuRsaL; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SZNuRsaL" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 79B0CC2BCB4; Mon, 13 Apr 2026 11:22:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1776079357; bh=5hgEljYltABvDGV+0FITDfNO2kQKx19vmq3Iy9sLHxo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SZNuRsaLnEKtT4yp7vU90mNtnFgMjD4IEvJ/0lSwV+oSSAhVazhyxhuRn2rwUq5Zv /1N3dkNiW4xMW8iDKUinJv8rd7WQVhUeTIQzEciCUHXS1dAyTpGdXBZQUO2FO/RMTm P413DtnKQau3a4yb+X10dv1zHMbAeB21Nsu2UnBFZWJukPiafbUkfQEzkqJD9fvJ7G 
UcXzMUWU2hfKrrAavStK4rFwpjuxWDHwRlBBUGbQQjjsthgq3jArnw7MHWNd55WjAC D3YGVHWua1BOyeZgmJ2Qwjg7WZ/bAIFZXJIjhk8JPHAjVJl+gEcPijbiVlX1hcCwvV N4n1U6hINKNiQ== From: Alexey Gladkov To: Christian Brauner , Dan Klishch Cc: Al Viro , "Eric W . Biederman" , Kees Cook , containers@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used Date: Mon, 13 Apr 2026 13:19:43 +0200 Message-ID: <38572c1fb7cf55b4c27dd792adafa52f1216e3a3.1776079055.git.legion@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

When procfs is mounted with the subset=pid option, none of the system files and directories in the root of the filesystem are accessible from userspace. Only dynamic information about processes is available, and it cannot be hidden by an overmount. For this reason, checking for full visibility is not relevant if mounting is performed with the subset=pid option.
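The resulting decision can be sketched as a simplified user-space model, for illustration only. `model_mount_request` and `model_mount_too_revealing` are hypothetical names, the boolean fields stand in for state the kernel derives from the mount namespace and the superblock, and the consistency check on the SB_I_NOEXEC and SB_I_NODEV flags is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs to the "mount too revealing" decision. */
struct model_mount_request {
	bool in_init_user_ns; /* mount namespace is owned by the initial user namespace */
	bool userns_visible;  /* filesystem is marked SB_I_USERNS_VISIBLE (procfs, sysfs) */
	bool skip_visibility; /* set by procfs when subset=pid is in effect */
	bool already_visible; /* a fully visible instance already exists in the namespace */
};

/* Returns true if the new mount should be refused as too revealing. */
bool model_mount_too_revealing(const struct model_mount_request *req)
{
	assert(req != NULL);

	if (req->in_init_user_ns)
		return false;
	if (!req->userns_visible)
		return false;

	/* With skip_visibility set, the "already visible" lookup is not consulted. */
	return !req->skip_visibility && !req->already_visible;
}
```

In this model a subset=pid mount is accepted even when no fully visible procfs instance exists, which is the behaviour change this patch introduces.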
Signed-off-by: Alexey Gladkov --- fs/fs_context.c | 1 + fs/namespace.c | 15 +++++++-------- fs/proc/root.c | 7 +++++++ include/linux/fs_context.h | 1 + 4 files changed, 16 insertions(+), 8 deletions(-) diff --git a/fs/fs_context.c b/fs/fs_context.c index a37b0a093505..2fd3d6422a38 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -545,6 +545,7 @@ void vfs_clean_context(struct fs_context *fc) kfree(fc->source); fc->source =3D NULL; fc->exclusive =3D false; + fc->skip_visibility =3D false; =20 fc->purpose =3D FS_CONTEXT_FOR_RECONFIGURE; fc->phase =3D FS_CONTEXT_AWAITING_RECONF; diff --git a/fs/namespace.c b/fs/namespace.c index 539b74403072..32aaedb020c1 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3755,7 +3755,7 @@ static int do_add_mount(struct mount *newmnt, const s= truct pinned_mountpoint *mp return graft_tree(newmnt, mp); } =20 -static bool mount_too_revealing(const struct super_block *sb, int *new_mnt= _flags); +static bool mount_too_revealing(struct fs_context *fc, int *new_mnt_flags); =20 /* * Create a new mount using a superblock configuration and request it @@ -3764,19 +3764,17 @@ static bool mount_too_revealing(const struct super_= block *sb, int *new_mnt_flags static int do_new_mount_fc(struct fs_context *fc, const struct path *mount= point, unsigned int mnt_flags) { - struct super_block *sb; struct vfsmount *mnt __free(mntput) =3D fc_mount(fc); int error; =20 if (IS_ERR(mnt)) return PTR_ERR(mnt); =20 - sb =3D fc->root->d_sb; - error =3D security_sb_kern_mount(sb); + error =3D security_sb_kern_mount(fc->root->d_sb); if (unlikely(error)) return error; =20 - if (unlikely(mount_too_revealing(sb, &mnt_flags))) { + if (unlikely(mount_too_revealing(fc, &mnt_flags))) { errorfcp(fc, "VFS", "Mount too revealing"); return -EPERM; } @@ -4463,7 +4461,7 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, fl= ags, return ret; =20 ret =3D -EPERM; - if (mount_too_revealing(fc->root->d_sb, &mnt_flags)) { + if (mount_too_revealing(fc, &mnt_flags)) { 
errorfcp(fc, "VFS", "Mount too revealing"); return ret; } @@ -6368,10 +6366,11 @@ static bool mnt_already_visible(struct mnt_namespac= e *ns, return false; } =20 -static bool mount_too_revealing(const struct super_block *sb, int *new_mnt= _flags) +static bool mount_too_revealing(struct fs_context *fc, int *new_mnt_flags) { const unsigned long required_iflags =3D SB_I_NOEXEC | SB_I_NODEV; struct mnt_namespace *ns =3D current->nsproxy->mnt_ns; + const struct super_block *sb =3D fc->root->d_sb; unsigned long s_iflags; =20 if (ns->user_ns =3D=3D &init_user_ns) @@ -6388,7 +6387,7 @@ static bool mount_too_revealing(const struct super_bl= ock *sb, int *new_mnt_flags return true; } =20 - return !mnt_already_visible(ns, sb, new_mnt_flags); + return (!fc->skip_visibility && !mnt_already_visible(ns, sb, new_mnt_flag= s)); } =20 bool mnt_may_suid(struct vfsmount *mnt) diff --git a/fs/proc/root.c b/fs/proc/root.c index 05558654df31..6dc870b3061b 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -263,6 +263,13 @@ static int proc_fill_super(struct super_block *s, stru= ct fs_context *fc) if (ret) return ret; =20 + /* + * The dynamic part of procfs cannot be hidden using overmount. + * Therefore, the check for "not fully visible" can be skipped. 
+ */ + if (fs_info->pidonly) + fc->skip_visibility =3D true; + /* User space would break if executables or devices appear on proc */ s->s_iflags |=3D SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV; s->s_flags |=3D SB_NODIRATIME | SB_NOSUID | SB_NOEXEC; diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index 0d6c8a6d7be2..d80b77df2628 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -110,6 +110,7 @@ struct fs_context { bool global:1; /* Goes into &init_user_ns */ bool oldapi:1; /* Coming from mount(2) */ bool exclusive:1; /* create new superblock, reject existing one */ + bool skip_visibility:1; /* Skip visibility check for SB_I_USERNS_VISIBL= E */ }; =20 struct fs_context_operations { --=20 2.53.0 From nobody Fri Apr 17 00:23:15 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 01E6C3C5555; Mon, 13 Apr 2026 11:22:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079360; cv=none; b=KhTGN7hNVsu+QCRTkmboxaXR0D35BJhzlXtvdufWO5gyy+seZ2uHUjXJG4P8/Rh69T8CrcmzTIe9gNmpYNSGkuw1LFdOsaM3e1eLA7toUyOHuqHHKvpylWyzWcfqwXKp5cF/sursiJzZ6zr5aVxZqM+hu/M8oMxs8xiWNLWl/Zw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776079360; c=relaxed/simple; bh=ih3F+e99A+RK4mWpStCjQYOR4dtgli7oebDrxx5J74Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Qqv0q6wySzO7qZStLWLq7SxKH71SOPYi1oFmhZ5JWPxvKsA9df3KvjeY8y77VPW7HH1DqwvViO6fYxnxfis+m3zqqPF8MiL30TXTLaFGZe6Z6il3OeWteziJo6qU/b0kLvHt8BWkeJy7DsnJ8rrB+YwnWbIG1TN3c/yq0frl2GQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org 
header.b=Gm/5jPJP; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Gm/5jPJP" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BC8A1C2BCAF; Mon, 13 Apr 2026 11:22:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1776079359; bh=ih3F+e99A+RK4mWpStCjQYOR4dtgli7oebDrxx5J74Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Gm/5jPJPJZs969fJ+wwOAZySR8Nohs2SH72Wf4o1B2Vlbh1JAiUCe9U3Q4QGjE4k/ zHfw/VkPRDkCaSxtPmGrK5KIZi1RplKgewkK6Az5+Ly72VwlDORXXm998LgJUoM3H+ I7pmwVVhgxWw2exdijaThEPHU3sSXRew6DDz2x4v3pEUB3hayIjDS+gqpSJ1uUBy3j vdH0R9FGfSg9DBTFBHM837l0hhPxL8xVZKOv5Q/WWFJ2sCPbzmrPiB60y0RVPNsLU9 6r1OT2owizJjV0SEtmxtFXASAdDfUGTQmoLZN+UgQbX1zJYrT0JxaIEZUilwnCRoeY 6PzMISg6Y0Qhg== From: Alexey Gladkov To: Christian Brauner , Dan Klishch Cc: Al Viro , "Eric W . Biederman" , Kees Cook , containers@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 5/5] docs: proc: add documentation about mount restrictions Date: Mon, 13 Apr 2026 13:19:44 +0200 Message-ID: <8d9734a83d9b85130485d76c3562064f836b7361.1776079055.git.legion@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" procfs has a number of mounting restrictions that are not documented anywhere. 
Signed-off-by: Alexey Gladkov
---
 Documentation/filesystems/proc.rst | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index b0c0d1b45b99..699cc372e6c8 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -52,6 +52,7 @@ fixes/update part 1.1 Stefani Seibold June 9 2009
 
   4 Configuring procfs
   4.1 Mount options
+  4.2 Mount restrictions
 
   5 Filesystem behavior
 
@@ -2410,6 +2411,20 @@ will use the calling process's active pid namespace. Note that the pid
 namespace of an existing procfs instance cannot be modified (attempting to do
 so will give an `-EBUSY` error).
 
+4.2 Mount restrictions
+--------------------------
+
+If user namespaces are in use, the kernel additionally checks the instances of
+procfs available to the mounter and will not allow procfs to be mounted if:
+
+  1. This mount is not fully visible unless the new procfs is going to be
+     mounted with the subset=pid option.
+
+     a. Its root directory is not the root directory of the filesystem.
+     b. Any file or non-empty procfs directory is hidden by another mount.
+
+  2. A new mount overrides the readonly option or any option from the atime family.
+
 Chapter 5: Filesystem behavior
 ==============================
 
-- 
2.53.0