Documentation/admin-guide/sysctl/fs.rst | 12 ++++++++++++ fs/Kconfig | 8 ++++++++ fs/namespace.c | 21 ++++++++++++++++++++- 3 files changed, 40 insertions(+), 1 deletion(-)
Idmapped mounts may have security implications [1] and have
no knobs to be disallowed at runtime or compile time.
This patch adds a sysctl and a config option to set its default value.
[1] https://lore.kernel.org/all/m18s7481xc.fsf@fess.ebiederm.org/
Based on work from Alexey Gladkov <legion@kernel.org>.
Signed-off-by: Anton V. Boyarshinov <boyarsh@altlinux.org>
---
Documentation/admin-guide/sysctl/fs.rst | 12 ++++++++++++
fs/Kconfig | 8 ++++++++
fs/namespace.c | 21 ++++++++++++++++++++-
3 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
index 2a501c9ddc55..f758c4ae5f66 100644
--- a/Documentation/admin-guide/sysctl/fs.rst
+++ b/Documentation/admin-guide/sysctl/fs.rst
@@ -105,6 +105,18 @@ you have some awesome number of simultaneous system users,
you might want to raise the limit.
+idmap_mounts
+------------
+
+Idmapped mounts may have security implications.
+This knob controls whether creation of idmapped mounts is allowed.
+When set to "1", creation of idmapped mounts is allowed.
+When set to "0", creation of idmapped mounts is not allowed.
+
+The default value is
+* 0, if ``IDMAP_MOUNTS_DEFAULT_OFF`` is enabled in the kernel configuration;
+* 1, otherwise.
+
file-max & file-nr
------------------
diff --git a/fs/Kconfig b/fs/Kconfig
index 7a2b11c0b803..d2203ba0183d 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -385,4 +385,12 @@ source "fs/unicode/Kconfig"
config IO_WQ
bool
+config IDMAP_MOUNTS_DEFAULT_OFF
+ bool "Disallow idmappad mounts by default"
+ help
+ Idmapped mounts may have security implications.
+ Enable this to disallow idmapped mounts by setting
+ the default value of /proc/sys/fs/idmap_mounts to 0.
+
+
endmenu
diff --git a/fs/namespace.c b/fs/namespace.c
index 40b994a29e90..66501ad75537 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -39,6 +39,10 @@
/* Maximum number of mounts in a mount namespace */
static unsigned int sysctl_mount_max __read_mostly = 100000;
+/* Whether idmapped mounts are allowed. */
+static int sysctl_idmap_mounts __read_mostly =
+ IS_ENABLED(CONFIG_IDMAP_MOUNTS_DEFAULT_OFF) ? 0 : 1;
+
static unsigned int m_hash_mask __read_mostly;
static unsigned int m_hash_shift __read_mostly;
static unsigned int mp_hash_mask __read_mostly;
@@ -3965,7 +3969,13 @@ static int can_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
if (!is_anon_ns(mnt->mnt_ns))
return -EINVAL;
- return 0;
+ /* So far, there are concerns about the safety of idmaps. */
+ if (!sysctl_idmap_mounts) {
+ pr_warn_once("VFS: idmapped mounts are not allowed.\n");
+ return -EPERM;
+ } else {
+ return 0;
+ }
}
static struct mount *mount_setattr_prepare(struct mount_kattr *kattr,
@@ -4631,6 +4641,15 @@ static struct ctl_table fs_namespace_sysctls[] = {
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ONE,
},
+ {
+ .procname = "idmap_mounts",
+ .data = &sysctl_idmap_mounts,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
{ }
};
--
2.33.0
On Fri, Feb 04, 2022 at 09:53:38AM +0300, Anton V. Boyarshinov wrote: > Idmapped mounts may have security implications [1] and have > no knobs to be disallowed at runtime or compile time. > > This patch adds a sysctl and a config option to set its default value. > > [1] https://lore.kernel.org/all/m18s7481xc.fsf@fess.ebiederm.org/ > > Based on work from Alexey Gladkov <legion@kernel.org>. > > Signed-off-by: Anton V. Boyarshinov <boyarsh@altlinux.org> > --- Thank your for the general idea, Anton. If you want to turn off idmapped mounts you can already do so today via: echo 0 > /proc/sys/user/max_user_namespaces Aside from that, idmapped mounts can only be created by fully privileged users on the host for a selected number of filesystems. They can neither be created as an unprivileged user nor can they be created inside user namespaces. I appreciate the worry. Any new feature may have security implications and bugs. In addition, we did address these allegations multiple times already (see [1], [2], [3], [4], [5]). As the author/maintainer of this feature, Nacked-by: Christian Brauner <brauner@kernel.org> [1]: https://lore.kernel.org/lkml/20210213130042.828076-1-christian.brauner@ubuntu.com/T/#m3a9df31aa183e8797c70bc193040adfd601399ad [2]: https://lore.kernel.org/lkml/m1r1ifzf8x.fsf@fess.ebiederm.org [3]: https://lore.kernel.org/lkml/20210213130042.828076-1-christian.brauner@ubuntu.com/T/#m59cdad9630d5a279aeecd0c1f117115144bc15eb [4]: https://lore.kernel.org/lkml/20210510125147.tkgeurcindldiwxg@wittgenstein [5]: https://lore.kernel.org/linux-fsdevel/CAHrFyr4AYi_gad7LQ-cJ9Peg=Gt73Sded8k_ZHeRZz=faGzpQA@mail.gmail.com
В Fri, 4 Feb 2022 10:45:15 +0100 Christian Brauner <brauner@kernel.org> пишет: > If you want to turn off idmapped mounts you can already do so today via: > echo 0 > /proc/sys/user/max_user_namespaces It turns off much more than idmapped mounts only. More fine grained control seems better for me. > They can neither > be created as an unprivileged user nor can they be created inside user > namespaces. But actions of fully privileged user can open non-obvious ways to privilege escalation.
On Fri, Feb 04, 2022 at 01:26:16PM +0300, Anton V. Boyarshinov wrote: > В Fri, 4 Feb 2022 10:45:15 +0100 > Christian Brauner <brauner@kernel.org> пишет: > > > If you want to turn off idmapped mounts you can already do so today via: > > echo 0 > /proc/sys/user/max_user_namespaces > > It turns off much more than idmapped mounts only. More fine grained > control seems better for me. If you allow user namespaces and not idmapped mounts you haven't reduced your attack surface. An unprivileged user can reach much more exploitable code simply via unshare -user --map-root -mount which we still allow upstream without a second thought even with all the past and present exploits (see https://www.openwall.com/lists/oss-security/2022/01/29/1 for a current one from this January). > > > They can neither > > be created as an unprivileged user nor can they be created inside user > > namespaces. > > But actions of fully privileged user can open non-obvious ways to > privilege escalation. A fully privileged user potentially being able to cause issues is really not an argument; especially not for a new sysctl. You need root to create idmapped mounts and you need root to turn off the new knob. It also trivially applies to a whole slew of even basic kernel tunables basically everything that can be reached by unprivileged users after a privileged user has turned it on or configured it. After 2 years we haven't seen any issue with this code and while I'm not promising that there won't ever be issues - nobody can do that - the pure suspicion that there could be some is not a justification for anything.
В Fri, 4 Feb 2022 16:10:32 +0100 Christian Brauner <brauner@kernel.org> пишет: > > It turns off much more than idmapped mounts only. More fine grained > > control seems better for me. > > If you allow user namespaces and not idmapped mounts you haven't reduced > your attack surface. I have. And many other people have. People who have creating user namespaces by unpriviliged user disabled. I find it sad that we have no tool in mainline kernel to limit users access to creating user namespaces except complete disabling them. But many distros have that tools. Different tools with different interfaces and semantics :( And at least one major GNU/Linux distro disabled idmapped mounts unconditionally. If I were the author of this functionality, I would prefer to have a knob then have it unavailible for for so many users. But as you wish. > An unprivileged user can reach much more > exploitable code simply via unshare -user --map-root -mount which we > still allow upstream without a second thought even with all the past and > present exploits (see > https://www.openwall.com/lists/oss-security/2022/01/29/1 for a current > one from this January). > > > > > > They can neither > > > be created as an unprivileged user nor can they be created inside user > > > namespaces. > > > > But actions of fully privileged user can open non-obvious ways to > > privilege escalation. > > A fully privileged user potentially being able to cause issues is really > not an argument; especially not for a new sysctl. > You need root to create idmapped mounts and you need root to turn off > the new knob. > > It also trivially applies to a whole slew of even basic kernel tunables > basically everything that can be reached by unprivileged users after a > privileged user has turned it on or configured it. > > After 2 years we haven't seen any issue with this code and while I'm not > promising that there won't ever be issues - nobody can do that - the > pure suspicion that there could be some is not a justification for > anything.
On Sat, 2022-02-05 at 10:57 +0300, Anton V. Boyarshinov wrote: > В Fri, 4 Feb 2022 16:10:32 +0100 > Christian Brauner <brauner@kernel.org> пишет: > > > > > It turns off much more than idmapped mounts only. More fine > > > grained control seems better for me. > > > > If you allow user namespaces and not idmapped mounts you haven't > > reduced your attack surface. > > I have. And many other people have. People who have creating user > namespaces by unpriviliged user disabled. Which would defeat the purpose of user namespaces which is to allow the creation of unprivileged containers by anyone and allow us to reduce the container attack surface by reducing the actual privilege given to some real world containers. You've raised vague, unactionable security concerns about this, but basically one of the jobs of user namespaces is to take some designated features guarded by CAP_SYS_ADMIN and give the admin of the namespace (the unprivileged user) access to them. There are always going to be vague security concerns about doing this. If you had an actual, actionable concern, we could fix it. What happens without this is that containers that need the functionality now have to run with real root inside, which is a massively bigger security problem. Adding knobs to disable features for unactionable security concerns gives a feel good in terms of security theatre, but it causes system unpredictability in that any given application now has to check if a feature is usable before it uses it and figure out what to do if it isn't available. The more we do it, the bigger the combinatoric explosion of possible missing features and every distro ends up having a different default combination. The bottom line is it's much better to find and fix actual security bugs than create a runtime configuration nightmare. > I find it sad that we have no tool in mainline kernel to limit users > access to creating user namespaces except complete disabling them. > But many distros have that tools. Different tools with different > interfaces and semantics :( Have you actually proposed something? A more granular approach to globally disabling user namespaces might be acceptable provided it doesn't lead to a feature configuration explosion. James
On Sat, Feb 05, 2022 at 10:57:58AM +0300, Anton V. Boyarshinov wrote: > В Fri, 4 Feb 2022 16:10:32 +0100 > Christian Brauner <brauner@kernel.org> пишет: > > > > > It turns off much more than idmapped mounts only. More fine grained > > > control seems better for me. > > > > If you allow user namespaces and not idmapped mounts you haven't reduced > > your attack surface. > > I have. And many other people have. People who have creating user > namespaces by unpriviliged user disabled. I find it sad that we have no > tool in mainline kernel to limit users access to creating user > namespaces except complete disabling them. But many distros have that > tools. Different tools with different interfaces and semantics :( > > And at least one major GNU/Linux distro disabled idmapped mounts > unconditionally. If I were the author of this functionality, I would > prefer to have a knob then have it unavailible for for so many users. But as you wish. You're talking about the author of the allegations being involved in disabling idmapped mounts for rhel under [2] as I was told. If a downstream distro wants to disable this feature based on allegations we've refuted multiple times then we can't stop them from doing so. The only disconcerting thing is that this helps spreads misinformation as evidenced by this patch. The allegations and refutation around them are all visible and I've linked to them in the initial reply. This is a root-only accessible feature with a massive testsuite and being used for 2 years. Each bug fixed gets its own regression test right away. We will of course take and upstream patches that fix actual clearly reported bugs. In the end it is not different from say Archlinux [1] having had user namespaces disabled for 5+ years from their introduction in 2013 onwards and many other examples. Downstream distros can make whatever choice they want and diverge from upstream. In any case, I'll be on vacation for about 2 weeks with very limited access to internet going forward. [1]: https://bugs.archlinux.org/task/36969 [2]: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/131 > > > An unprivileged user can reach much more > > exploitable code simply via unshare -user --map-root -mount which we > > still allow upstream without a second thought even with all the past and > > present exploits (see > > https://www.openwall.com/lists/oss-security/2022/01/29/1 for a current > > one from this January). > > > > > > > > > They can neither > > > > be created as an unprivileged user nor can they be created inside user > > > > namespaces. > > > > > > But actions of fully privileged user can open non-obvious ways to > > > privilege escalation. > > > > A fully privileged user potentially being able to cause issues is really > > not an argument; especially not for a new sysctl. > > You need root to create idmapped mounts and you need root to turn off > > the new knob. > > > > It also trivially applies to a whole slew of even basic kernel tunables > > basically everything that can be reached by unprivileged users after a > > privileged user has turned it on or configured it. > > > > After 2 years we haven't seen any issue with this code and while I'm not > > promising that there won't ever be issues - nobody can do that - the > > pure suspicion that there could be some is not a justification for > > anything. > >
© 2016 - 2026 Red Hat, Inc.