net/mptcp/ctrl.c | 17 +++++++++-------- net/rds/tcp.c | 39 ++++++++++++++++++++++++++++++++------- net/sctp/sysctl.c | 14 ++++++++------ 3 files changed, 49 insertions(+), 21 deletions(-)
As pointed out by Al Viro and Eric Dumazet in [1], using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' or 'pernet' structure can be obtained from the table->data
using container_of().
Note that table->data could also be used directly in more places, but
that would increase the size of this fix to replace all accesses via
'net'.
Patches 2-9 remove access of net via current->nsproxy in sysfs handlers
in MPTCP, SCTP and RDS. There are multiple patches to ease the
backports.
Patch 1 is not directly linked to this, but it is a small fix for MPTCP
available_schedulers sysctl knob to explicitly mark it as read-only.
Please note that this series does not address Al's comment [2]. In SCTP,
some sysctl knobs set other sysfs-exposed variables for the min/max: two
processes could then write two linked values at the same time, resulting
in new values being outside the new boundaries.
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
Link: https://lore.kernel.org/netdev/20250105211158.GL1977892@ZenIV/ [2]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Matthieu Baerts (NGI0) (9):
mptcp: sysctl: avail sched: remove write access
mptcp: sysctl: sched: avoid using current->nsproxy
mptcp: sysctl: blackhole timeout: avoid using current->nsproxy
sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
sctp: sysctl: rto_min/max: avoid using current->nsproxy
sctp: sysctl: auth_enable: avoid using current->nsproxy
sctp: sysctl: udp_port: avoid using current->nsproxy
sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
net/mptcp/ctrl.c | 17 +++++++++--------
net/rds/tcp.c | 39 ++++++++++++++++++++++++++++++++-------
net/sctp/sysctl.c | 14 ++++++++------
3 files changed, 49 insertions(+), 21 deletions(-)
---
base-commit: 474ab61b2b3a9a0c31267efe807c4fed12d66d28
change-id: 20250105-mptcp-sysfs-netns-23723ba85401
Best regards,
--
Matthieu Baerts (NGI0) <matttbe@kernel.org>
On Tue, 7 Jan 2025, Matthieu Baerts (NGI0) wrote:
> As pointed out by Al Viro and Eric Dumazet in [1], using the 'net'
> structure via 'current' is not recommended for different reasons:
>
> - Inconsistency: getting info from the reader's/writer's netns vs only
> from the opener's netns.
>
> - current->nsproxy can be NULL in some cases, resulting in an 'Oops'
> (null-ptr-deref), e.g. when the current task is exiting, as spotted by
> syzbot [1] using acct(2).
>
> The 'net' or 'pernet' structure can be obtained from the table->data
> using container_of().
>
> Note that table->data could also be used directly in more places, but
> that would increase the size of this fix to replace all accesses via
> 'net'.
>
> Patches 2-9 remove access of net via current->nsproxy in sysfs handlers
> in MPTCP, SCTP and RDS. There are multiple patches to ease the
> backports.
>
> Patch 1 is not directly linked to this, but it is a small fix for MPTCP
> available_schedulers sysctl knob to explicitly mark it as read-only.
>
> Please note that this series does not address Al's comment [2]. In SCTP,
> some sysctl knobs set other sysfs-exposed variables for the min/max: two
> processes could then write two linked values at the same time, resulting
> in new values being outside the new boundaries.
Thanks Matthieu,
The mptcp patches (1-3) LGTM:
Reviewed-by: Mat Martineau <martineau@kernel.org>
The sctp and rds patches look fine as well.
- Mat
>
> Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
> Link: https://lore.kernel.org/netdev/20250105211158.GL1977892@ZenIV/ [2]
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
> Matthieu Baerts (NGI0) (9):
> mptcp: sysctl: avail sched: remove write access
> mptcp: sysctl: sched: avoid using current->nsproxy
> mptcp: sysctl: blackhole timeout: avoid using current->nsproxy
> sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
> sctp: sysctl: rto_min/max: avoid using current->nsproxy
> sctp: sysctl: auth_enable: avoid using current->nsproxy
> sctp: sysctl: udp_port: avoid using current->nsproxy
> sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
> rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
>
> net/mptcp/ctrl.c | 17 +++++++++--------
> net/rds/tcp.c | 39 ++++++++++++++++++++++++++++++++-------
> net/sctp/sysctl.c | 14 ++++++++------
> 3 files changed, 49 insertions(+), 21 deletions(-)
> ---
> base-commit: 474ab61b2b3a9a0c31267efe807c4fed12d66d28
> change-id: 20250105-mptcp-sysfs-netns-23723ba85401
>
> Best regards,
> --
> Matthieu Baerts (NGI0) <matttbe@kernel.org>
>
>
>
Hi Mat,
On 08/01/2025 02:49, Mat Martineau wrote:
> On Tue, 7 Jan 2025, Matthieu Baerts (NGI0) wrote:
>
>> As pointed out by Al Viro and Eric Dumazet in [1], using the 'net'
>> structure via 'current' is not recommended for different reasons:
>>
>> - Inconsistency: getting info from the reader's/writer's netns vs only
>> from the opener's netns.
>>
>> - current->nsproxy can be NULL in some cases, resulting in an 'Oops'
>> (null-ptr-deref), e.g. when the current task is exiting, as spotted by
>> syzbot [1] using acct(2).
>>
>> The 'net' or 'pernet' structure can be obtained from the table->data
>> using container_of().
>>
>> Note that table->data could also be used directly in more places, but
>> that would increase the size of this fix to replace all accesses via
>> 'net'.
>>
>> Patches 2-9 remove access of net via current->nsproxy in sysfs handlers
>> in MPTCP, SCTP and RDS. There are multiple patches to ease the
>> backports.
>>
>> Patch 1 is not directly linked to this, but it is a small fix for MPTCP
>> available_schedulers sysctl knob to explicitly mark it as read-only.
>>
>> Please note that this series does not address Al's comment [2]. In SCTP,
>> some sysctl knobs set other sysfs-exposed variables for the min/max: two
>> processes could then write two linked values at the same time, resulting
>> in new values being outside the new boundaries.
>
>
> Thanks Matthieu,
>
> The mptcp patches (1-3) LGTM:
>
> Reviewed-by: Mat Martineau <martineau@kernel.org>
>
>
> The sctp and rds patches look fine as well.
Thank you for this review!
Patches are now in our tree (fixes for -net) with your RvB tag on the 3
first ones. I will upstream them later on today.
New patches for t/upstream-net and t/upstream:
- 968e9bb5a4cd: mptcp: sysctl: avail sched: remove write access
- eda363130f80: mptcp: sysctl: sched: avoid using current->nsproxy
- c838529f87ee: mptcp: sysctl: blackhole timeout: avoid using
current->nsproxy
- fb1a7a8fb16c: sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
- a18c9552dca0: sctp: sysctl: rto_min/max: avoid using current->nsproxy
- 8408f9182e52: sctp: sysctl: auth_enable: avoid using current->nsproxy
- c45cb99c9aa0: sctp: sysctl: udp_port: avoid using current->nsproxy
- 3df3f684ee7e: sctp: sysctl: plpmtud_probe_interval: avoid using
current->nsproxy
- a77b912deb35: rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using
current->nsproxy
- Results: bd3d4b1ab1bc..d66993cd4a59 (export-net)
- Results: ebe8ebeee62e..f289bc80702c (export)
Tests are now in progress:
- export-net:
https://github.com/multipath-tcp/mptcp_net-next/commit/6c101bff7e1613c0087472d2ee78c986e221435e/checks
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/0e96d0da96b041f087c5c0b46bc34978572ccffe/checks
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
Hi Matthieu,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/12656267762
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/6e18f999ad48
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=923074
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
© 2016 - 2026 Red Hat, Inc.