net/socket.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can change
between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
`BPF_CGROUP_RUN_PROG_GETSOCKOPT`.
If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false" to
"true"
between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
`BPF_CGROUP_RUN_PROG_GETSOCKOPT`,
`BPF_CGROUP_RUN_PROG_GETSOCKOPT` will receive an -EFAULT from
`__cgroup_bpf_run_filter_getsockopt(max_optlen=0)` due to `get_user()`
had not reached in `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.
Scenario shown as below:
`process A` `process B`
----------- ------------
BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
enable CGROUP_GETSOCKOPT
BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
Prevent `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` change between
`BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and `BPF_CGROUP_RUN_PROG_GETSOCKOPT`
by acquiring cgroup_lock.
Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
---
We have encountered this issue by observing that process A could sometimes
get an -EFAULT from getsockopt() during our device boot-up, while another
process B triggers the race condition by enabling CGROUP_GETSOCKOPT
through bpf syscall at the same time.
The race condition is shown below:
`process A` `process B`
----------- ------------
BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
bpf syscall
(CGROUP_GETSOCKOPT enabled)
BPF_CGROUP_RUN_PROG_GETSOCKOPT
-> __cgroup_bpf_run_filter_getsockopt
(-EFAULT)
__cgroup_bpf_run_filter_getsockopt return -EFAULT at the line shown below:
if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
if (orig_optlen > PAGE_SIZE && ctx.optlen >= 0) {
pr_info_once("bpf getsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
ctx.optlen, max_optlen);
ret = retval;
goto out;
}
ret = -EFAULT; <== return EFAULT here
goto out;
}
This patch should fix the race but not sure if it introduces any potential
side effects or regression.
And we wondering if this is a real issue in do_sock_getsockopt or if
getsockopt() is designed to expect such race conditions.
Should the userspace caller always anticipate an -EFAULT from getsockopt()
if another process enables CGROUP_GETSOCKOPT at the same time?
Any comment will be appreciated!
BTW, I added Chengjui and Yanghui to Co-developed due to we have several
discussions on this issue. And we both spend some time on this issue.
---
net/socket.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/socket.c b/net/socket.c
index fcbdd5bc47ac..e0b2b16fd238 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2370,8 +2370,10 @@ int do_sock_getsockopt(struct socket *sock, bool compat, int level,
if (err)
return err;
- if (!compat)
+ if (!compat) {
+ cgroup_lock();
max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
+ }
ops = READ_ONCE(sock->ops);
if (level == SOL_SOCKET) {
@@ -2387,10 +2389,12 @@ int do_sock_getsockopt(struct socket *sock, bool compat, int level,
optlen.user);
}
- if (!compat)
+ if (!compat) {
err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
optval, optlen, max_optlen,
err);
+ cgroup_unlock();
+ }
return err;
}
--
2.45.2
On Mon, Aug 19, 2024 at 10:27 AM Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:
>
> The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can change
> between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> `BPF_CGROUP_RUN_PROG_GETSOCKOPT`.
>
> If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false" to
> "true"
> between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> `BPF_CGROUP_RUN_PROG_GETSOCKOPT`,
> `BPF_CGROUP_RUN_PROG_GETSOCKOPT` will receive an -EFAULT from
> `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)` due to `get_user()`
> had not reached in `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.
>
> Scenario shown as below:
>
> `process A` `process B`
> ----------- ------------
> BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
> enable CGROUP_GETSOCKOPT
> BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
>
> Prevent `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` change between
> `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and `BPF_CGROUP_RUN_PROG_GETSOCKOPT`
> by acquiring cgroup_lock.
>
> Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
> Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
> Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
>
> ---
>
> We have encountered this issue by observing that process A could sometimes
> get an -EFAULT from getsockopt() during our device boot-up, while another
> process B triggers the race condition by enabling CGROUP_GETSOCKOPT
> through bpf syscall at the same time.
>
> The race condition is shown below:
>
> `process A` `process B`
> ----------- ------------
> BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
>
> bpf syscall
> (CGROUP_GETSOCKOPT enabled)
>
> BPF_CGROUP_RUN_PROG_GETSOCKOPT
> -> __cgroup_bpf_run_filter_getsockopt
> (-EFAULT)
>
> __cgroup_bpf_run_filter_getsockopt return -EFAULT at the line shown below:
> if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
> if (orig_optlen > PAGE_SIZE && ctx.optlen >= 0) {
> pr_info_once("bpf getsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
> ctx.optlen, max_optlen);
> ret = retval;
> goto out;
> }
> ret = -EFAULT; <== return EFAULT here
> goto out;
> }
>
> This patch should fix the race but not sure if it introduces any potential
> side effects or regression.
>
> And we wondering if this is a real issue in do_sock_getsockopt or if
> getsockopt() is designed to expect such race conditions.
> Should the userspace caller always anticipate an -EFAULT from getsockopt()
> if another process enables CGROUP_GETSOCKOPT at the same time?
>
> Any comment will be appreciated!
>
> BTW, I added Chengjui and Yanghui to Co-developed due to we have several
> discussions on this issue. And we both spend some time on this issue.
>
> ---
> net/socket.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/socket.c b/net/socket.c
> index fcbdd5bc47ac..e0b2b16fd238 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -2370,8 +2370,10 @@ int do_sock_getsockopt(struct socket *sock, bool compat, int level,
> if (err)
> return err;
>
> - if (!compat)
> + if (!compat) {
> + cgroup_lock();
> max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
> + }
>
Acquiring cgroup_lock mutex in socket getsockopt() fast path ?
There is no way we can accept such a patch, please come up with a
reasonable patch.
cgroup_bpf_enabled() should probably be used once.
© 2016 - 2026 Red Hat, Inc.