net/mptcp/ctrl.c | 17 +++++++++-------- net/rds/tcp.c | 39 ++++++++++++++++++++++++++++++++------- net/sctp/sysctl.c | 14 ++++++++------ 3 files changed, 49 insertions(+), 21 deletions(-)
As pointed out by Al Viro and Eric Dumazet in [1], using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' or 'pernet' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly in more places, but that would increase the size of this fix to replace all accesses via 'net'. Patches 2-9 remove access of net via current->nsproxy in sysfs handlers in MPTCP, SCTP and RDS. There are multiple patches to ease the backports. Patch 1 is not directly linked to this, but it is a small fix for MPTCP available_schedulers sysctl knob to explicitly mark it as read-only. Please note that this series does not address Al's comment [2]. In SCTP, some sysctl knobs set other sysfs-exposed variables for the min/max: two processes could then write two linked values at the same time, resulting in new values being outside the new boundaries. Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Link: https://lore.kernel.org/netdev/20250105211158.GL1977892@ZenIV/ [2] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- Matthieu Baerts (NGI0) (9): mptcp: sysctl: avail sched: remove write access mptcp: sysctl: sched: avoid using current->nsproxy mptcp: sysctl: blackhole timeout: avoid using current->nsproxy sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy sctp: sysctl: rto_min/max: avoid using current->nsproxy sctp: sysctl: auth_enable: avoid using current->nsproxy sctp: sysctl: udp_port: avoid using current->nsproxy sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy net/mptcp/ctrl.c | 17 +++++++++-------- net/rds/tcp.c | 39 ++++++++++++++++++++++++++++++++------- net/sctp/sysctl.c | 14 ++++++++------ 3 files changed, 49 insertions(+), 21 deletions(-) --- base-commit: 474ab61b2b3a9a0c31267efe807c4fed12d66d28 change-id: 20250105-mptcp-sysfs-netns-23723ba85401 Best regards, -- Matthieu Baerts (NGI0) <matttbe@kernel.org>
On Tue, 7 Jan 2025, Matthieu Baerts (NGI0) wrote: > As pointed out by Al Viro and Eric Dumazet in [1], using the 'net' > structure via 'current' is not recommended for different reasons: > > - Inconsistency: getting info from the reader's/writer's netns vs only > from the opener's netns. > > - current->nsproxy can be NULL in some cases, resulting in an 'Oops' > (null-ptr-deref), e.g. when the current task is exiting, as spotted by > syzbot [1] using acct(2). > > The 'net' or 'pernet' structure can be obtained from the table->data > using container_of(). > > Note that table->data could also be used directly in more places, but > that would increase the size of this fix to replace all accesses via > 'net'. > > Patches 2-9 remove access of net via current->nsproxy in sysfs handlers > in MPTCP, SCTP and RDS. There are multiple patches to ease the > backports. > > Patch 1 is not directly linked to this, but it is a small fix for MPTCP > available_schedulers sysctl knob to explicitly mark it as read-only. > > Please note that this series does not address Al's comment [2]. In SCTP, > some sysctl knobs set other sysfs-exposed variables for the min/max: two > processes could then write two linked values at the same time, resulting > in new values being outside the new boundaries. Thanks Matthieu, The mptcp patches (1-3) LGTM: Reviewed-by: Mat Martineau <martineau@kernel.org> The sctp and rds patches look fine as well. - Mat > > Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] > Link: https://lore.kernel.org/netdev/20250105211158.GL1977892@ZenIV/ [2] > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> > --- > Matthieu Baerts (NGI0) (9): > mptcp: sysctl: avail sched: remove write access > mptcp: sysctl: sched: avoid using current->nsproxy > mptcp: sysctl: blackhole timeout: avoid using current->nsproxy > sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy > sctp: sysctl: rto_min/max: avoid using current->nsproxy > sctp: sysctl: auth_enable: avoid using current->nsproxy > sctp: sysctl: udp_port: avoid using current->nsproxy > sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy > rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy > > net/mptcp/ctrl.c | 17 +++++++++-------- > net/rds/tcp.c | 39 ++++++++++++++++++++++++++++++++------- > net/sctp/sysctl.c | 14 ++++++++------ > 3 files changed, 49 insertions(+), 21 deletions(-) > --- > base-commit: 474ab61b2b3a9a0c31267efe807c4fed12d66d28 > change-id: 20250105-mptcp-sysfs-netns-23723ba85401 > > Best regards, > -- > Matthieu Baerts (NGI0) <matttbe@kernel.org> > > >
Hi Mat, On 08/01/2025 02:49, Mat Martineau wrote: > On Tue, 7 Jan 2025, Matthieu Baerts (NGI0) wrote: > >> As pointed out by Al Viro and Eric Dumazet in [1], using the 'net' >> structure via 'current' is not recommended for different reasons: >> >> - Inconsistency: getting info from the reader's/writer's netns vs only >> from the opener's netns. >> >> - current->nsproxy can be NULL in some cases, resulting in an 'Oops' >> (null-ptr-deref), e.g. when the current task is exiting, as spotted by >> syzbot [1] using acct(2). >> >> The 'net' or 'pernet' structure can be obtained from the table->data >> using container_of(). >> >> Note that table->data could also be used directly in more places, but >> that would increase the size of this fix to replace all accesses via >> 'net'. >> >> Patches 2-9 remove access of net via current->nsproxy in sysfs handlers >> in MPTCP, SCTP and RDS. There are multiple patches to ease the >> backports. >> >> Patch 1 is not directly linked to this, but it is a small fix for MPTCP >> available_schedulers sysctl knob to explicitly mark it as read-only. >> >> Please note that this series does not address Al's comment [2]. In SCTP, >> some sysctl knobs set other sysfs-exposed variables for the min/max: two >> processes could then write two linked values at the same time, resulting >> in new values being outside the new boundaries. > > > Thanks Matthieu, > > The mptcp patches (1-3) LGTM: > > Reviewed-by: Mat Martineau <martineau@kernel.org> > > > The sctp and rds patches look fine as well. Thank you for this review! Patches are now in our tree (fixes for -net) with your RvB tag on the 3 first ones. I will upstream them later on today. New patches for t/upstream-net and t/upstream: - 968e9bb5a4cd: mptcp: sysctl: avail sched: remove write access - eda363130f80: mptcp: sysctl: sched: avoid using current->nsproxy - c838529f87ee: mptcp: sysctl: blackhole timeout: avoid using current->nsproxy - fb1a7a8fb16c: sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy - a18c9552dca0: sctp: sysctl: rto_min/max: avoid using current->nsproxy - 8408f9182e52: sctp: sysctl: auth_enable: avoid using current->nsproxy - c45cb99c9aa0: sctp: sysctl: udp_port: avoid using current->nsproxy - 3df3f684ee7e: sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy - a77b912deb35: rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy - Results: bd3d4b1ab1bc..d66993cd4a59 (export-net) - Results: ebe8ebeee62e..f289bc80702c (export) Tests are now in progress: - export-net: https://github.com/multipath-tcp/mptcp_net-next/commit/6c101bff7e1613c0087472d2ee78c986e221435e/checks - export: https://github.com/multipath-tcp/mptcp_net-next/commit/0e96d0da96b041f087c5c0b46bc34978572ccffe/checks Cheers, Matt -- Sponsored by the NGI0 Core fund.
Hi Matthieu, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal: Success! ✅ - KVM Validation: debug: Success! ✅ - KVM Validation: btf-normal (only bpftest_all): Success! ✅ - KVM Validation: btf-debug (only bpftest_all): Success! ✅ - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/12656267762 Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/6e18f999ad48 Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=923074 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-normal For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core)
© 2016 - 2025 Red Hat, Inc.