[PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap

Jiayuan Chen posted 3 patches 1 week, 1 day ago
Posted by Jiayuan Chen 1 week, 1 day ago
When the server has MPTCP enabled but receives a non-MP-capable request
from a client, it calls mptcp_fallback_tcp_ops().

Since non-MPTCP connections are allowed to use sockmap, which replaces
sk->sk_prot, using sk->sk_prot to determine the IP version in
mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
incorrect ops to sk->sk_socket->ops.

Additionally, when BPF Sockmap modifies the protocol handlers, the
original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
trigger warnings.

Fix this by using the more stable sk_family to distinguish between IPv4
and IPv6 connections, ensuring correct fallback protocol operations are
selected even when BPF Sockmap has modified the socket protocol handlers.

Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/mptcp/protocol.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0292162a14ee..2393741bc310 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
 
 static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
 {
+	/* When BPF sockmap is used, it may replace sk->sk_prot.
+	 * Using sk_family is a reliable way to determine the IP version.
+	 */
+	unsigned short family = READ_ONCE(sk->sk_family);
+
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
-	if (sk->sk_prot == &tcpv6_prot)
+	if (family == AF_INET6)
 		return &inet6_stream_ops;
 #endif
-	WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
+	WARN_ON_ONCE(family != AF_INET);
 	return &inet_stream_ops;
 }
 
-- 
2.43.0
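
To make the failure mode described in the commit message concrete, here is a minimal userspace sketch. It is not kernel code: every `model_*` and `MODEL_*` name is a hypothetical stand-in invented for illustration (for `struct sock`, `struct proto`, `struct proto_ops`, and the address-family constants). It only models the idea that swapping the proto pointer, as the commit message says sockmap does with sk->sk_prot, defeats a pointer comparison while the family field stays stable.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
enum { MODEL_AF_INET = 2, MODEL_AF_INET6 = 10 };

struct model_proto { const char *name; };
struct model_proto_ops { const char *name; };

struct model_sock {
	unsigned short family;          /* models sk->sk_family */
	const struct model_proto *prot; /* models sk->sk_prot */
};

static const struct model_proto model_tcp_prot = { "tcp" };
static const struct model_proto model_tcpv6_prot = { "tcpv6" };
/* Models the replacement proto installed once a socket is in a sockmap. */
static const struct model_proto model_sockmap_prot = { "sockmap-override" };

static const struct model_proto_ops model_inet_stream_ops = { "inet_stream_ops" };
static const struct model_proto_ops model_inet6_stream_ops = { "inet6_stream_ops" };

/* Pre-patch scheme: infer the IP version from the proto pointer. */
static const struct model_proto_ops *
model_fallback_ops_by_prot(const struct model_sock *sk)
{
	if (sk->prot == &model_tcpv6_prot)
		return &model_inet6_stream_ops;
	return &model_inet_stream_ops;
}

/* Post-patch scheme: infer the IP version from the address family. */
static const struct model_proto_ops *
model_fallback_ops_by_family(const struct model_sock *sk)
{
	if (sk->family == MODEL_AF_INET6)
		return &model_inet6_stream_ops;
	return &model_inet_stream_ops;
}

/* Walk through the scenario: an IPv6 socket whose proto pointer is swapped. */
static void model_demo(void)
{
	struct model_sock sk = { MODEL_AF_INET6, &model_tcpv6_prot };

	/* Before the swap, both schemes agree on the IPv6 ops. */
	assert(model_fallback_ops_by_prot(&sk) == &model_inet6_stream_ops);
	assert(model_fallback_ops_by_family(&sk) == &model_inet6_stream_ops);

	/* Model the proto replacement. */
	sk.prot = &model_sockmap_prot;

	/* The pointer comparison now falls through to the IPv4 ops ... */
	assert(model_fallback_ops_by_prot(&sk) == &model_inet_stream_ops);
	/* ... while the family check still selects the IPv6 ops. */
	assert(model_fallback_ops_by_family(&sk) == &model_inet6_stream_ops);

	printf("by prot:   %s\n", model_fallback_ops_by_prot(&sk)->name);
	printf("by family: %s\n", model_fallback_ops_by_family(&sk)->name);
}
```

Calling `model_demo()` from a `main()` exercises both schemes on the same socket. In this model the pre-patch scheme hands IPv4 ops to an IPv6 socket after the swap, which corresponds to the "incorrect ops" case in the commit message; in the real function the same condition would also trip the old WARN_ON_ONCE().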
Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
Posted by Paolo Abeni 3 days, 4 hours ago
On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> When the server has MPTCP enabled but receives a non-MP-capable request
> from a client, it calls mptcp_fallback_tcp_ops().
> 
> Since non-MPTCP connections are allowed to use sockmap, which replaces
> sk->sk_prot, using sk->sk_prot to determine the IP version in
> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> incorrect ops to sk->sk_socket->ops.

I don't see how sockmap could modify the to-be-accepted socket sk_prot
before mptcp_fallback_tcp_ops(), as such call happens before the fd is
installed, and AFAICS sockmap can only fetch sockets via fds.

Is this patch needed?

/P
Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
Posted by Paolo Abeni 3 days, 4 hours ago
On 10/28/25 12:30 PM, Paolo Abeni wrote:
> On 10/23/25 2:54 PM, Jiayuan Chen wrote:
>> When the server has MPTCP enabled but receives a non-MP-capable request
>> from a client, it calls mptcp_fallback_tcp_ops().
>>
>> Since non-MPTCP connections are allowed to use sockmap, which replaces
>> sk->sk_prot, using sk->sk_prot to determine the IP version in
>> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
>> incorrect ops to sk->sk_socket->ops.
> 
> I don't see how sockmap could modify the to-be-accepted socket sk_prot
> before mptcp_fallback_tcp_ops(), as such call happens before the fd is
> installed, and AFAICS sockmap can only fetch sockets via fds.
> 
> Is this patch needed?

Matttbe explained off-list the details of how that could happen. I think
the commit message here must be more verbose and explain the whys
clearly, even to those not proficient in sockmap, like me.

Thanks,

Paolo
Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
Posted by Matthieu Baerts 1 week, 1 day ago
Hi Jiayuan,

On 23/10/2025 14:54, Jiayuan Chen wrote:
> When the server has MPTCP enabled but receives a non-MP-capable request
> from a client, it calls mptcp_fallback_tcp_ops().
> 
> Since non-MPTCP connections are allowed to use sockmap, which replaces
> sk->sk_prot, using sk->sk_prot to determine the IP version in
> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> incorrect ops to sk->sk_socket->ops.
> 
> Additionally, when BPF Sockmap modifies the protocol handlers, the
> original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
> trigger warnings.
> 
> Fix this by using the more stable sk_family to distinguish between IPv4
> and IPv6 connections, ensuring correct fallback protocol operations are
> selected even when BPF Sockmap has modified the socket protocol handlers.
> 
> Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  net/mptcp/protocol.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 0292162a14ee..2393741bc310 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
>  
>  static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
>  {
> +	/* When BPF sockmap is used, it may replace sk->sk_prot.
> +	 * Using sk_family is a reliable way to determine the IP version.
> +	 */
> +	unsigned short family = READ_ONCE(sk->sk_family);
> +
>  #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> -	if (sk->sk_prot == &tcpv6_prot)
> +	if (family == AF_INET6)
>  		return &inet6_stream_ops;
>  #endif
> -	WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
> +	WARN_ON_ONCE(family != AF_INET);
>  	return &inet_stream_ops;

Just to be sure: is there anything in BPF modifying sk->sk_socket->ops?
Because that's what mptcp_fallback_tcp_ops() will do somehow.

In other words, is it always fine to set inet(6)_stream_ops? (I guess
yes, but better to be sure while we are looking at that :) )

>  }
>  

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
Posted by Jiayuan Chen 1 week, 1 day ago
October 23, 2025 at 22:10, "Matthieu Baerts" <matttbe@kernel.org> wrote:


> 
> Hi Jiayuan,
> 
> On 23/10/2025 14:54, Jiayuan Chen wrote:
> 
> > 
> > When the server has MPTCP enabled but receives a non-MP-capable request
> >  from a client, it calls mptcp_fallback_tcp_ops().
> >  
> >  Since non-MPTCP connections are allowed to use sockmap, which replaces
> >  sk->sk_prot, using sk->sk_prot to determine the IP version in
> >  mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> >  incorrect ops to sk->sk_socket->ops.
> >  
> >  Additionally, when BPF Sockmap modifies the protocol handlers, the
> >  original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
> >  trigger warnings.
> >  
> >  Fix this by using the more stable sk_family to distinguish between IPv4
> >  and IPv6 connections, ensuring correct fallback protocol operations are
> >  selected even when BPF Sockmap has modified the socket protocol handlers.
> >  
> >  Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
> >  Cc: <stable@vger.kernel.org>
> >  Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> >  Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
> >  ---
> >  net/mptcp/protocol.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >  
> >  diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> >  index 0292162a14ee..2393741bc310 100644
> >  --- a/net/mptcp/protocol.c
> >  +++ b/net/mptcp/protocol.c
> >  @@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
> >  
> >  static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
> >  {
> >  +	/* When BPF sockmap is used, it may replace sk->sk_prot.
> >  +	 * Using sk_family is a reliable way to determine the IP version.
> >  +	 */
> >  +	unsigned short family = READ_ONCE(sk->sk_family);
> >  +
> >  #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> >  -	if (sk->sk_prot == &tcpv6_prot)
> >  +	if (family == AF_INET6)
> >  		return &inet6_stream_ops;
> >  #endif
> >  -	WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
> >  +	WARN_ON_ONCE(family != AF_INET);
> >  	return &inet_stream_ops;
> > 
> Just to be sure: is there anything in BPF modifying sk->sk_socket->ops?
> Because that's what mptcp_fallback_tcp_ops() will do somehow.
> 
> In other words, is it always fine to set inet(6)_stream_ops? (I guess
> yes, but better to be sure while we are looking at that :) )



Hi Matt,

I can confirm that on the BPF side, the only special operations that
currently target sockets are sockmap/sockhash. Their implementations do not
modify sk->sk_socket->ops. They only modify sk->sk_prot, because the BPF
side typically operates on 'struct sock' and does not concern itself with
'struct socket'.

Therefore, setting inet(6)_stream_ops is fine.

Thanks,
Jiayuan

> > 
> > }
> > 
> Cheers,
> Matt
> -- 
> Sponsored by the NGI0 Core fund.
>