[PATCH net v3 2/3] bpf,sockmap: disallow MPTCP sockets from sockmap

Jiayuan Chen posted 3 patches 1 week, 1 day ago
Posted by Jiayuan Chen 1 week, 1 day ago
MPTCP creates subflows for data transmission, and these sockets should not
be added to sockmap because MPTCP sets specialized data_ready handlers
that would be overridden by sockmap.

Additionally, the parent socket of the MPTCP subflows (the subflows being
plain TCP sockets) uses mptcp_prot, which requires protocol-specific
handling that conflicts with sockmap's operation.

Add checks that reject MPTCP subflows and their parent sockets when they
are added to sockmap, while still allowing listening MPTCP sockets so that
reuseport socket selection keeps working.

We cannot add this logic to sock_map_sk_state_allowed() because the sockops
path does not call that function, and sockets coming from sockops may be in
states such as SYN_RECV. Moving sock_map_sk_state_allowed() into
sock_{map,hash}_update_common() is therefore not appropriate. Instead,
introduce a new function to handle the MPTCP checks.

Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/sock_map.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 5947b38e4f8b..5be38cdfb5cc 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -467,6 +467,27 @@ static int sock_map_get_next_key(struct bpf_map *map, void *key, void *next)
 	return 0;
 }
 
+/* Disallow MPTCP subflows and their parent sockets. However, a TCP_LISTEN
+ * MPTCP socket is permitted because sockmap can also serve for reuseport
+ * socket selection.
+ */
+static inline bool sock_map_sk_type_allowed(const struct sock *sk)
+{
+	/* MPTCP subflows are not intended for data I/O by user */
+	if (sk_is_tcp(sk) && sk_is_mptcp(sk))
+		goto disallow;
+
+	/* MPTCP parents use mptcp_prot - not supported with sockmap yet */
+	if (sk->sk_protocol == IPPROTO_MPTCP && sk->sk_state != TCP_LISTEN)
+		goto disallow;
+
+	return true;
+
+disallow:
+	pr_err_once("sockmap/sockhash: MPTCP sockets are not supported\n");
+	return false;
+}
+
 static int sock_map_update_common(struct bpf_map *map, u32 idx,
 				  struct sock *sk, u64 flags)
 {
@@ -482,6 +503,9 @@ static int sock_map_update_common(struct bpf_map *map, u32 idx,
 	if (unlikely(idx >= map->max_entries))
 		return -E2BIG;
 
+	if (!sock_map_sk_type_allowed(sk))
+		return -EOPNOTSUPP;
+
 	link = sk_psock_init_link();
 	if (!link)
 		return -ENOMEM;
@@ -1003,6 +1027,9 @@ static int sock_hash_update_common(struct bpf_map *map, void *key,
 	if (unlikely(flags > BPF_EXIST))
 		return -EINVAL;
 
+	if (!sock_map_sk_type_allowed(sk))
+		return -EOPNOTSUPP;
+
 	link = sk_psock_init_link();
 	if (!link)
 		return -ENOMEM;
-- 
2.43.0
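
The decision taken by sock_map_sk_type_allowed() in the patch above can be
modelled in user space. The sketch below is not kernel code: struct
model_sock, its is_mptcp_subflow field (standing in for the
sk_is_tcp(sk) && sk_is_mptcp(sk) test) and the MODEL_* constants are
hypothetical stand-ins, and only the branch structure follows the patch.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins; the numeric values mirror the
 * Linux definitions of IPPROTO_TCP, IPPROTO_MPTCP, TCP_ESTABLISHED
 * and TCP_LISTEN.
 */
enum { MODEL_IPPROTO_TCP = 6, MODEL_IPPROTO_MPTCP = 262 };
enum { MODEL_TCP_ESTABLISHED = 1, MODEL_TCP_LISTEN = 10 };

struct model_sock {
	int protocol;          /* models sk->sk_protocol */
	int state;             /* models sk->sk_state */
	bool is_mptcp_subflow; /* models sk_is_tcp(sk) && sk_is_mptcp(sk) */
};

/* Same branch order as the patch: reject subflows first, then reject
 * MPTCP parent sockets unless they are listening.
 */
static bool model_sk_type_allowed(const struct model_sock *sk)
{
	if (sk->is_mptcp_subflow)
		return false;

	if (sk->protocol == MODEL_IPPROTO_MPTCP &&
	    sk->state != MODEL_TCP_LISTEN)
		return false;

	return true;
}
```

In this model a plain TCP socket and a listening MPTCP socket are accepted,
while an MPTCP subflow or a non-listening MPTCP parent is rejected, which
corresponds to the -EOPNOTSUPP return in the two update paths.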
Re: [PATCH net v3 2/3] bpf,sockmap: disallow MPTCP sockets from sockmap
Posted by Paolo Abeni 3 days, 4 hours ago
On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> MPTCP creates subflows for data transmission, and these sockets should not
> be added to sockmap because MPTCP sets specialized data_ready handlers
> that would be overridden by sockmap.
> 
> Additionally, for the parent socket of MPTCP subflows (plain TCP socket),
> MPTCP sk requires specific protocol handling that conflicts with sockmap's
> operation(mptcp_prot).
> 
> This patch adds proper checks to reject MPTCP subflows and their parent
> sockets from being added to sockmap, while preserving compatibility with
> reuseport functionality for listening MPTCP sockets.

It's unclear to me why that is safe. sockmap is going to change the
listener msk proto ops.

The listener could disconnect and create an egress connection, still
using the wrong ops.

I think sockmap should always be prevented for mptcp sockets, or at least
a solid explanation of why such an exception is safe should be included in
the commit message.

Note that the first option allows for solving the issue entirely in the
mptcp code, setting dummy/noop psock_update_sk_prot for mptcp sockets
and mptcp subflows.

/P
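
A rough user-space model of the first option mentioned in the reply above
(a dummy psock_update_sk_prot for MPTCP) might look like the following. It
assumes that sockmap switches a socket's protocol operations through a
per-protocol psock_update_sk_prot callback and fails when that callback
reports an error. All model_* names are hypothetical, and returning
-EOPNOTSUPP from the dummy callback is an assumption, since the reply only
says "dummy/noop".

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sock;

/* Hypothetical, reduced stand-in for struct proto. */
struct model_proto {
	const char *name;
	int (*psock_update_sk_prot)(struct model_sock *sk, bool restore);
};

struct model_sock {
	const struct model_proto *prot;
	bool ops_replaced; /* true once sockmap has swapped the ops */
};

/* Models a protocol that supports sockmap: the ops get swapped. */
static int model_tcp_update_prot(struct model_sock *sk, bool restore)
{
	sk->ops_replaced = !restore;
	return 0;
}

/* Models the suggested dummy callback: never touch the ops and
 * report that sockmap is not supported (assumed error code).
 */
static int model_mptcp_update_prot(struct model_sock *sk, bool restore)
{
	(void)sk;
	(void)restore;
	return -EOPNOTSUPP;
}

static const struct model_proto model_tcp_prot = {
	.name = "TCP",
	.psock_update_sk_prot = model_tcp_update_prot,
};

static const struct model_proto model_mptcp_prot = {
	.name = "MPTCP",
	.psock_update_sk_prot = model_mptcp_update_prot,
};

/* Models the point where sockmap asks the protocol to switch ops. */
static int model_sock_map_init_proto(struct model_sock *sk)
{
	if (!sk->prot->psock_update_sk_prot)
		return -EINVAL;
	return sk->prot->psock_update_sk_prot(sk, false);
}
```

With this shape the rejection lives entirely in the protocol's own
callback, so the sockmap update paths need no MPTCP-specific check and no
exception for listening sockets.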