net/mptcp/protocol.c | 2 ++ net/mptcp/protocol.h | 1 + net/mptcp/subflow.c | 15 +++++++++------ 3 files changed, 12 insertions(+), 6 deletions(-)
The ehash table lookups are lockless and rely on
SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
during RCU read-side critical sections. Both tcp_prot and
tcpv6_prot have their slab caches created with this flag
via proto_register().
However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
tcpv6_prot_override during inet_init() (fs_initcall, level 5),
before inet6_init() (module_init/device_initcall, level 6) has
called proto_register(&tcpv6_prot). At that point,
tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
remains NULL permanently.
This causes MPTCP v6 subflow child sockets to be allocated via
kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab
cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so
when these sockets are freed without SOCK_RCU_FREE (which is
cleared for child sockets by design), the memory can be
immediately reused. Concurrent ehash lookups under
rcu_read_lock can then access freed memory, triggering a
slab-use-after-free in __inet_lookup_established.
Fix this by splitting the IPv6-specific initialization out of
mptcp_subflow_init() into a new mptcp_subflow_v6_init(), called
from mptcp_proto_v6_init() before protocol registration. This
ensures tcpv6_prot_override.slab correctly inherits the
SLAB_TYPESAFE_BY_RCU slab cache.
Fixes: b19bc2945b40 ("mptcp: implement delegated actions")
Cc: stable@vger.kernel.org
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
net/mptcp/protocol.c | 2 ++
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 15 +++++++++------
3 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 65c3bb8016f4..614c3f583ca0 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -4660,6 +4660,8 @@ int __init mptcp_proto_v6_init(void)
{
int err;
+ mptcp_subflow_v6_init();
+
mptcp_v6_prot = mptcp_prot;
strscpy(mptcp_v6_prot.name, "MPTCPv6", sizeof(mptcp_v6_prot.name));
mptcp_v6_prot.slab = NULL;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 0bd1ee860316..ec15e503da8b 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -875,6 +875,7 @@ static inline void mptcp_subflow_tcp_fallback(struct sock *sk,
void __init mptcp_proto_init(void);
#if IS_ENABLED(CONFIG_MPTCP_IPV6)
int __init mptcp_proto_v6_init(void);
+void __init mptcp_subflow_v6_init(void);
#endif
struct sock *mptcp_sk_clone_init(const struct sock *sk,
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 6716970693e9..4ff5863aa9fd 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -2165,7 +2165,15 @@ void __init mptcp_subflow_init(void)
tcp_prot_override.psock_update_sk_prot = NULL;
#endif
+ mptcp_diag_subflow_init(&subflow_ulp_ops);
+
+ if (tcp_register_ulp(&subflow_ulp_ops) != 0)
+ panic("MPTCP: failed to register subflows to ULP\n");
+}
+
#if IS_ENABLED(CONFIG_MPTCP_IPV6)
+void __init mptcp_subflow_v6_init(void)
+{
/* In struct mptcp_subflow_request_sock, we assume the TCP request sock
* structures for v4 and v6 have the same size. It should not changed in
* the future but better to make sure to be warned if it is no longer
@@ -2204,10 +2212,5 @@ void __init mptcp_subflow_init(void)
/* Disable sockmap processing for subflows */
tcpv6_prot_override.psock_update_sk_prot = NULL;
#endif
-#endif
-
- mptcp_diag_subflow_init(&subflow_ulp_ops);
-
- if (tcp_register_ulp(&subflow_ulp_ops) != 0)
- panic("MPTCP: failed to register subflows to ULP\n");
}
+#endif
--
2.43.0
Hi Jiayuan,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 2 failed test(s): packetdrill_dss packetdrill_fastopen ⚠️
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/24029252526
Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/269f5127d03d
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1077546
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 6 Apr 2026 11:15:10 +0800 you wrote:
> The ehash table lookups are lockless and rely on
> SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
> during RCU read-side critical sections. Both tcp_prot and
> tcpv6_prot have their slab caches created with this flag
> via proto_register().
>
> However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
> tcpv6_prot_override during inet_init() (fs_initcall, level 5),
> before inet6_init() (module_init/device_initcall, level 6) has
> called proto_register(&tcpv6_prot). At that point,
> tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
> remains NULL permanently.
>
> [...]
Here is the summary with links:
- [net,v2] mptcp: fix slab-use-after-free in __inet_lookup_established
https://git.kernel.org/netdev/net/c/9b55b253907e
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Hi Jiayuan, On 06/04/2026 05:15, Jiayuan Chen wrote: > The ehash table lookups are lockless and rely on > SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability > during RCU read-side critical sections. Both tcp_prot and > tcpv6_prot have their slab caches created with this flag > via proto_register(). > > However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into > tcpv6_prot_override during inet_init() (fs_initcall, level 5), > before inet6_init() (module_init/device_initcall, level 6) has > called proto_register(&tcpv6_prot). At that point, > tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab > remains NULL permanently. > > This causes MPTCP v6 subflow child sockets to be allocated via > kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab > cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so > when these sockets are freed without SOCK_RCU_FREE (which is > cleared for child sockets by design), the memory can be > immediately reused. Concurrent ehash lookups under > rcu_read_lock can then access freed memory, triggering a > slab-use-after-free in __inet_lookup_established. > > Fix this by splitting the IPv6-specific initialization out of > mptcp_subflow_init() into a new mptcp_subflow_v6_init(), called > from mptcp_proto_v6_init() before protocol registration. This > ensures tcpv6_prot_override.slab correctly inherits the > SLAB_TYPESAFE_BY_RCU slab cache. Thank you for this v2, it looks good to me: Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> @Netdev maintainers: this patch can be applied in 'net' directly. Cheers, Matt -- Sponsored by the NGI0 Core fund.
© 2016 - 2026 Red Hat, Inc.