This sysctl limits groups who can create a new userns without
CAP_SYS_ADMIN in the current userns, so as to mitigate potential kernel
vulnerabilities around userns.

The sysctl value format is same as "net.ipv4.ping_group_range".

To disable creating new unprivileged userns, set the sysctl value to "1
0" in the initial userns.

To allow everyone to create new userns, set the sysctl value to "0
4294967294". This is the default value.

This sysctl replaces "kernel.unprivileged_userns_clone" that is found in
Ubuntu [1] and Debian GNU/Linux.

Link: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/commit?id=3422764 [1]

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>

Akihiro Suda (3):
  net/ipv4: split group_range logic to kernel/group_range.c
  group_range: allow GID from 2147483648 to 4294967294
  userns: add sysctl "kernel.userns_group_range"

 include/linux/group_range.h    |  18 +++++
 include/linux/user_namespace.h |   5 ++
 include/net/netns/ipv4.h       |   9 +--
 include/net/ping.h             |   6 --
 kernel/Makefile                |   2 +-
 kernel/fork.c                  |  24 +++++++
 kernel/group_range.c           | 123 +++++++++++++++++++++++++++++++++
 kernel/sysctl.c                |  30 ++++++++
 kernel/user.c                  |   9 +++
 net/ipv4/ping.c                |  39 +----------
 net/ipv4/sysctl_net_ipv4.c     |  56 ++-------------
 11 files changed, 219 insertions(+), 102 deletions(-)
 create mode 100644 include/linux/group_range.h
 create mode 100644 kernel/group_range.c

--
2.38.4
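The range semantics described in the cover letter can be modelled in a few lines of user-space Python. This is only a sketch of the "LOW HIGH" format, assuming the inclusive-range behaviour of "net.ipv4.ping_group_range"; the function names `parse_group_range` and `gid_in_range` are hypothetical and are not taken from the patch.

```python
# Largest valid GID: 4294967294. 4294967295 is (gid_t)-1, the invalid GID.
GID_MAX = 2**32 - 2


def parse_group_range(text):
    """Parse a "LOW HIGH" sysctl string into a (low, high) pair of integers."""
    low, high = (int(field) for field in text.split())
    return low, high


def gid_in_range(gid, group_range):
    """Return True if gid lies in the inclusive range [low, high]."""
    low, high = group_range
    return low <= gid <= high


disabled = parse_group_range("1 0")           # low > high: matches no GID
everyone = parse_group_range("0 4294967294")  # the default in the cover letter

print(gid_in_range(1000, disabled))  # False
print(gid_in_range(1000, everyone))  # True
```

The value "1 0" works as an "off" switch simply because no GID can be both at least 1 and at most 0.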
~akihirosuda <akihirosuda@git.sr.ht> writes:

> This sysctl limits groups who can create a new userns without
> CAP_SYS_ADMIN in the current userns, so as to mitigate potential kernel
> vulnerabilities around userns.
>
> The sysctl value format is same as "net.ipv4.ping_group_range".
>
> To disable creating new unprivileged userns, set the sysctl value to "1
> 0" in the initial userns.
>
> To allow everyone to create new userns, set the sysctl value to "0
> 4294967294". This is the default value.
>
> This sysctl replaces "kernel.unprivileged_userns_clone" that is found in
> Ubuntu [1] and Debian GNU/Linux.
>
> Link: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/commit?id=3422764 [1]
>
> Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>

How does this functionally differ from what already exists
user.max_user_namespaces?

Given that setns exists I don't see limiting creation of user namespaces
by group being meaningful, if your goal is to reduce the attack surface
of the kernel to mitigate potential kernel vulnerabilities.

How does this functionality interact with the use of setgroups in a user
namespace?

What is the value of a group_range inside of a newly created user
namespace? How does that work to maintain the policy you are trying to
implement?

Eric
On Tue, May 30, 2023 at 2:50 PM ~akihirosuda <akihirosuda@git.sr.ht> wrote:
>
> This sysctl limits groups who can create a new userns without
> CAP_SYS_ADMIN in the current userns, so as to mitigate potential kernel
> vulnerabilities around userns.
>
> The sysctl value format is same as "net.ipv4.ping_group_range".
>
> To disable creating new unprivileged userns, set the sysctl value to "1
> 0" in the initial userns.
>
> To allow everyone to create new userns, set the sysctl value to "0
> 4294967294". This is the default value.
>
> This sysctl replaces "kernel.unprivileged_userns_clone" that is found in
> Ubuntu [1] and Debian GNU/Linux.
>
> Link: https://git.launchpad.net/~ubuntu-
> kernel/ubuntu/+source/linux/+git/jammy/commit?id=3422764 [1]
Given the challenges around adding access controls to userns
operations, have you considered using the LSM support that was added
upstream last year? The relevant LSM hook can be found in commit
7cd4c5c2101c ("security, lsm: Introduce security_create_user_ns()"),
and although only SELinux currently provides an access control
implementation, there is no reason you couldn't add support for your
favorite LSM, or even just a simple BPF LSM to enforce the group
controls as you've described them here.
> Akihiro Suda (3):
> net/ipv4: split group_range logic to kernel/group_range.c
> group_range: allow GID from 2147483648 to 4294967294
> userns: add sysctl "kernel.userns_group_range"
>
> include/linux/group_range.h | 18 +++++
> include/linux/user_namespace.h | 5 ++
> include/net/netns/ipv4.h | 9 +--
> include/net/ping.h | 6 --
> kernel/Makefile | 2 +-
> kernel/fork.c | 24 +++++++
> kernel/group_range.c | 123 +++++++++++++++++++++++++++++++++
> kernel/sysctl.c | 30 ++++++++
> kernel/user.c | 9 +++
> net/ipv4/ping.c | 39 +----------
> net/ipv4/sysctl_net_ipv4.c | 56 ++-------------
> 11 files changed, 219 insertions(+), 102 deletions(-)
> create mode 100644 include/linux/group_range.h
> create mode 100644 kernel/group_range.c
--
paul-moore.com
Paul Moore <paul@paul-moore.com> writes:
> On Tue, May 30, 2023 at 2:50 PM ~akihirosuda <akihirosuda@git.sr.ht> wrote:
>>
>> This sysctl limits groups who can create a new userns without
>> CAP_SYS_ADMIN in the current userns, so as to mitigate potential kernel
>> vulnerabilities around userns.
>>
>> The sysctl value format is same as "net.ipv4.ping_group_range".
>>
>> To disable creating new unprivileged userns, set the sysctl value to "1
>> 0" in the initial userns.
>>
>> To allow everyone to create new userns, set the sysctl value to "0
>> 4294967294". This is the default value.
>>
>> This sysctl replaces "kernel.unprivileged_userns_clone" that is found in
>> Ubuntu [1] and Debian GNU/Linux.
>>
>> Link: https://git.launchpad.net/~ubuntu-
>> kernel/ubuntu/+source/linux/+git/jammy/commit?id=3422764 [1]
>
> Given the challenges around adding access controls to userns
> operations, have you considered using the LSM support that was added
> upstream last year? The relevant LSM hook can be found in commit
> 7cd4c5c2101c ("security, lsm: Introduce security_create_user_ns()"),
Paul how have you handled the real world regression I reported against
chromium?
Paul are you aware that the LSM hook can not be used to achieve the
objective of this patchset?
> and although only SELinux currently provides an access control
> implementation, there is no reason you couldn't add support for your
> favorite LSM, or even just a simple BPF LSM to enforce the group
> controls as you've described them here.
Is there a publicly available SELinux policy that uses that LSM hook?
Eric
On Thu, Jun 1, 2023 at 8:14 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> Paul Moore <paul@paul-moore.com> writes:
> > On Tue, May 30, 2023 at 2:50 PM ~akihirosuda <akihirosuda@git.sr.ht> wrote:
> >>
> >> This sysctl limits groups who can create a new userns without
> >> CAP_SYS_ADMIN in the current userns, so as to mitigate potential kernel
> >> vulnerabilities around userns.
> >>
> >> The sysctl value format is same as "net.ipv4.ping_group_range".
> >>
> >> To disable creating new unprivileged userns, set the sysctl value to "1
> >> 0" in the initial userns.
> >>
> >> To allow everyone to create new userns, set the sysctl value to "0
> >> 4294967294". This is the default value.
> >>
> >> This sysctl replaces "kernel.unprivileged_userns_clone" that is found in
> >> Ubuntu [1] and Debian GNU/Linux.
> >>
> >> Link: https://git.launchpad.net/~ubuntu-
> >> kernel/ubuntu/+source/linux/+git/jammy/commit?id=3422764 [1]
> >
> > Given the challenges around adding access controls to userns
> > operations, have you considered using the LSM support that was added
> > upstream last year? The relevant LSM hook can be found in commit
> > 7cd4c5c2101c ("security, lsm: Introduce security_create_user_ns()"),
>
> Paul how have you handled the real world regression I reported against
> chromium?
I don't track chromium development.
> Paul are you aware that the LSM hook can not be used to achieve the
> objective of this patchset?
/me shrugs
I thought one could look into a cred struct using a BPF LSM, which
would allow one to make access control decisions based on group ID,
but I will be the first to admit I'm not a BPF LSM expert.
Regardless, one could introduce a group ID check into a LSM if they
were so inclined.
I also find it slightly amusing that you are arguing against my reply
that was discussing *not* adding another userns control point; of all
people I thought you would be supportive of this ... /me shrugs again.
> > and although only SELinux currently provides an access control
> > implementation, there is no reason you couldn't add support for your
> > favorite LSM, or even just a simple BPF LSM to enforce the group
> > controls as you've described them here.
>
> Is there a publicly available SELinux policy that uses that LSM hook?
I have no idea, I don't track all of the publicly available SELinux
policies because frankly it doesn't matter; the SELinux feature
exists, and it is my role to support and maintain it. There are
LSM/SELinux features which are not widely exercised in general purpose
SELinux policies for various reasons, but *are* used in specialized
environments that are not often discussed on public mailing lists.
--
paul-moore.com
Paul Moore <paul@paul-moore.com> writes:
> On Thu, Jun 1, 2023 at 8:14 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> Paul Moore <paul@paul-moore.com> writes:
>> >
>> > Given the challenges around adding access controls to userns
>> > operations, have you considered using the LSM support that was added
>> > upstream last year? The relevant LSM hook can be found in commit
>> > 7cd4c5c2101c ("security, lsm: Introduce security_create_user_ns()"),
>>
>> Paul how have you handled the real world regression I reported against
>> chromium?
>
> I don't track chromium development.
You have chosen to be the maintainer and I reported it to you.
>> Paul are you aware that the LSM hook can not be used to achieve the
>> objective of this patchset?
>
> /me shrugs
>
[snip parts about performing a group id check]
The LSM hook you added does not have the technical capability to reduce
the attack surface to mitigate bugs in the kernel. It is the
ineffectiveness of the hook not the permission check that I was
referring to.
Eric
On Thu, Jun 1, 2023 at 9:41 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> Paul Moore <paul@paul-moore.com> writes:
> > On Thu, Jun 1, 2023 at 8:14 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> Paul Moore <paul@paul-moore.com> writes:
> >> >
> >> > Given the challenges around adding access controls to userns
> >> > operations, have you considered using the LSM support that was added
> >> > upstream last year? The relevant LSM hook can be found in commit
> >> > 7cd4c5c2101c ("security, lsm: Introduce security_create_user_ns()"),
> >>
> >> Paul how have you handled the real world regression I reported against
> >> chromium?
> >
> > I don't track chromium development.
>
> You have chosen to be the maintainer and I reported it to you.
I just dug through all of the mail I've received from you over the
past two (?) years, as well as checking the LSM archive on lore and I
don't see any bug reports from you directed at the upstream LSM or
SELinux code ... perhaps I missed something, do you have a pointer?
Also, for the sake of clarification, I do not maintain any part of
Chromium or Chrome OS. I do maintain the upstream LSM, SELinux,
audit, and labeled networking subsystems in the Linux Kernel as well
as a couple of userspace packages.
> >> Paul are you aware that the LSM hook can not be used to achieve the
> >> objective of this patchset?
> >
> > /me shrugs
>
> [snip parts about performing a group id check]
My comments here were only discussing the possibility of performing a
group ID based access control check; I made no claims about the
desirability of such a check, and I have no interest in rehashing our
old debates.
--
paul-moore.com
Thank you all for the feedback.
> (Paul)
> Given the challenges around adding access controls to userns
> operations, have you considered using the LSM support that was added
> upstream last year?
I'll consider this.
> The relevant LSM hook can be found in commit
> 7cd4c5c2101c ("security, lsm: Introduce security_create_user_ns()"),
> and although only SELinux currently provides an access control
> implementation, there is no reason you couldn't add support for your
> favorite LSM, or even just a simple BPF LSM to enforce the group
> controls as you've described them here.
My intent is to provide a universal knob that works for both SELinux
distros and AppArmor distros.
So I guess I should try to use BPF LSM (and find out how its end-user
UX in userspace can be made as simple as a sysctl).
---
> (Christian)
> Yes. Please, no more sysctls...
I'll try to find another way, such as (BPF) LSM.
---
> (Eric)
> How does this functionally differ from what already exists
> user.max_user_namespaces?
My patch is not about the number of UserNS, but about who is
permitted to create a UserNS,
which can be a potential attack surface to pwn the root in the initial UserNS.
> Given that setns exists I don't see limiting creation of user namespaces
> by group being meaningful, if your goal is to reduce the attack surface
> of the kernel to mitigate potential kernel vulnerabilities.
Yes, that's my goal.
The intent is to allow creating UserNS only for (semi-trusted) human
users who need to run unprivileged (aka rootless) containers.
Creating UserNS is expected to be prohibited for system daemon
accounts and human users who do not need (or who are not trusted
enough) to run containers.
This configuration should be more secure than just allowing everybody
to run unprivileged (aka rootless) containers, and also more secure
than just disabling UserNS and running everything as the root.
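The intended policy can be sketched as a small user-space model. This is not the patch's code: the function name `may_create_userns`, the example GIDs, and the choice to check supplementary groups as well as the effective GID (borrowed from how "net.ipv4.ping_group_range" is understood to behave) are all assumptions made for illustration.

```python
def may_create_userns(egid, supplementary_gids, group_range,
                      has_cap_sys_admin=False):
    """Model of the policy: CAP_SYS_ADMIN bypasses the group check;
    otherwise at least one of the caller's GIDs must fall in the range."""
    if has_cap_sys_admin:
        return True
    low, high = group_range
    return any(low <= gid <= high for gid in (egid, *supplementary_gids))


# Hypothetical setup: GID 2000 is a "rootless" group for container users.
rootless_only = (2000, 2000)

# A human user who was added to the hypothetical group.
print(may_create_userns(1000, [1000, 2000], rootless_only))  # True

# A system daemon account outside the group.
print(may_create_userns(33, [33], rootless_only))  # False

# A privileged caller is not restricted by the range.
print(may_create_userns(0, [0], rootless_only, has_cap_sys_admin=True))  # True
```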
> How does this functionality interact with the use of setgroups in a user
> namespace?
>
> What is the value of a group_range inside of a newly created user
> namespace? How does that work to maintain the policy you are trying to
> implement?
In a child UserNS, the group_range file is expected to use the mapped
IDs, not the initial UserNS IDs.
(So, the range can't be just initialized with `.range = {0,
((gid_t)~0U) - 1}`. My patch v1 is wrong.)
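To illustrate what "mapped IDs" means here, the sketch below translates a GID seen inside a child user namespace to the corresponding GID in the parent, using the three-column extent format of /proc/PID/gid_map (ID inside, ID outside, length). How the proposed sysctl would apply such a translation is not specified in this thread; the helper names and the example mapping are hypothetical.

```python
def parse_gid_map(text):
    """Parse gid_map-style text into a list of (inside, outside, length)."""
    extents = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(field) for field in line.split())
            extents.append((inside, outside, length))
    return extents


def gid_to_parent(gid, extents):
    """Translate a GID in the child namespace to the parent namespace.
    Returns None when the GID is not covered by any extent (unmapped)."""
    for inside, outside, length in extents:
        if inside <= gid < inside + length:
            return outside + (gid - inside)
    return None


# Hypothetical mapping of the kind used by rootless containers:
# GID 0 inside is GID 1000 outside; GIDs 1..65536 map to 100000..165535.
extents = parse_gid_map("0 1000 1\n1 100000 65536\n")

print(gid_to_parent(0, extents))      # 1000
print(gid_to_parent(1, extents))      # 100000
print(gid_to_parent(65537, extents))  # None
```

Under such a mapping a range written as "0 4294967294" inside the child cannot mean "every GID on the host", which is one way to read the remark that a fixed initializer is wrong for child namespaces.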
> Paul are you aware that the LSM hook can not be used to achieve the
> objective of this patchset?
What would be an obstacle to using an LSM hook for this? (with an
addition of GID checks)
On Fri, Jun 2, 2023 at 23:50, Paul Moore <paul@paul-moore.com> wrote:
>
> On Thu, Jun 1, 2023 at 9:41 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> > Paul Moore <paul@paul-moore.com> writes:
> > > On Thu, Jun 1, 2023 at 8:14 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> > >> Paul Moore <paul@paul-moore.com> writes:
> > >> >
> > >> > Given the challenges around adding access controls to userns
> > >> > operations, have you considered using the LSM support that was added
> > >> > upstream last year? The relevant LSM hook can be found in commit
> > >> > 7cd4c5c2101c ("security, lsm: Introduce security_create_user_ns()"),
> > >>
> > >> Paul how have you handled the real world regression I reported against
> > >> chromium?
> > >
> > > I don't track chromium development.
> >
> > You have chosen to be the maintainer and I reported it to you.
>
> I just dug through all of the mail I've received from you over the
> past two (?) years, as well as checking the LSM archive on lore and I
> don't see any bug reports from you directed at the upstream LSM or
> SELinux code ... perhaps I missed something, do you have a pointer?
>
> Also, for the sake of clarification, I do not maintain any part of
> Chromium or Chrome OS. I do maintain the upstream LSM, SELinux,
> audit, and labeled networking subsystems in the Linux Kernel as well
> as a couple of userspace packages.
>
> > >> Paul are you aware that the LSM hook can not be used to achieve the
> > >> objective of this patchset?
> > >
> > > /me shrugs
> >
> > [snip parts about performing a group id check]
>
> My comments here were only discussing the possibility of performing a
> group ID based access control check; I made no claims about the
> desirability of such a check, and I have no interest in rehashing our
> old debates.
>
> --
> paul-moore.com
On Tue, May 30, 2023 at 05:58:48PM -0400, Paul Moore wrote:
> On Tue, May 30, 2023 at 2:50 PM ~akihirosuda <akihirosuda@git.sr.ht> wrote:
> >
> > This sysctl limits groups who can create a new userns without
> > CAP_SYS_ADMIN in the current userns, so as to mitigate potential kernel
> > vulnerabilities around userns.
> >
> > The sysctl value format is same as "net.ipv4.ping_group_range".
> >
> > To disable creating new unprivileged userns, set the sysctl value to "1
> > 0" in the initial userns.
> >
> > To allow everyone to create new userns, set the sysctl value to "0
> > 4294967294". This is the default value.
> >
> > This sysctl replaces "kernel.unprivileged_userns_clone" that is found in
> > Ubuntu [1] and Debian GNU/Linux.
> >
> > Link: https://git.launchpad.net/~ubuntu-
> > kernel/ubuntu/+source/linux/+git/jammy/commit?id=3422764 [1]
>
> Given the challenges around adding access controls to userns
> operations, have you considered using the LSM support that was added
> upstream last year? The relevant LSM hook can be found in commit
> 7cd4c5c2101c ("security, lsm: Introduce security_create_user_ns()"),
> and although only SELinux currently provides an access control
> implementation, there is no reason you couldn't add support for your
> favorite LSM, or even just a simple BPF LSM to enforce the group
> controls as you've described them here.
Yes. Please, no more sysctls...