When the server has MPTCP enabled but receives a non-MP-capable request
from a client, it calls mptcp_fallback_tcp_ops().

Since non-MPTCP connections are allowed to use sockmap, which replaces
sk->sk_prot, using sk->sk_prot to determine the IP version in
mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
incorrect ops to sk->sk_socket->ops.

Additionally, when BPF Sockmap modifies the protocol handlers, the
original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
trigger warnings.

Fix this by using the more stable sk_family to distinguish between IPv4
and IPv6 connections, ensuring correct fallback protocol operations are
selected even when BPF Sockmap has modified the socket protocol handlers.

Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
---
net/mptcp/protocol.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0292162a14ee..2393741bc310 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
 static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
 {
+	/* When BPF sockmap is used, it may replace sk->sk_prot.
+	 * Using sk_family is a reliable way to determine the IP version.
+	 */
+	unsigned short family = READ_ONCE(sk->sk_family);
+
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
-	if (sk->sk_prot == &tcpv6_prot)
+	if (family == AF_INET6)
 		return &inet6_stream_ops;
 #endif
-	WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
+	WARN_ON_ONCE(family != AF_INET);
 	return &inet_stream_ops;
 }
--
2.43.0
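
For readers less familiar with sockmap, the difference between the two checks can be sketched in plain userspace C. This is only an illustrative model: every `mock_` name below is a hypothetical stand-in rather than a kernel definition, and `mock_tcp_bpf_prot` merely represents whatever proto sockmap would install in place of the original one.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel objects; not the real definitions. */
struct mock_proto { const char *name; };
struct mock_proto_ops { int family; };

enum { MOCK_AF_INET = 2, MOCK_AF_INET6 = 10 };

static struct mock_proto mock_tcp_prot = { "TCP" };
static struct mock_proto mock_tcpv6_prot = { "TCPv6" };
/* Represents the replacement proto sockmap would install (assumption). */
static struct mock_proto mock_tcp_bpf_prot = { "TCP+sockmap" };

static const struct mock_proto_ops mock_inet_stream_ops = { MOCK_AF_INET };
static const struct mock_proto_ops mock_inet6_stream_ops = { MOCK_AF_INET6 };

struct mock_sock {
	unsigned short family;
	const struct mock_proto *prot;
};

/* Old approach: infer the IP version from the proto pointer. */
static const struct mock_proto_ops *ops_by_prot(const struct mock_sock *sk)
{
	if (sk->prot == &mock_tcpv6_prot)
		return &mock_inet6_stream_ops;
	return &mock_inet_stream_ops;
}

/* Patched approach: infer the IP version from the address family. */
static const struct mock_proto_ops *ops_by_family(const struct mock_sock *sk)
{
	if (sk->family == MOCK_AF_INET6)
		return &mock_inet6_stream_ops;
	return &mock_inet_stream_ops;
}

static void demo_fallback_selection(void)
{
	struct mock_sock v4 = { MOCK_AF_INET, &mock_tcp_prot };
	struct mock_sock v6 = { MOCK_AF_INET6, &mock_tcpv6_prot };

	/* Untouched sockets: both approaches agree. */
	assert(ops_by_prot(&v4) == &mock_inet_stream_ops);
	assert(ops_by_family(&v4) == &mock_inet_stream_ops);
	assert(ops_by_prot(&v6) == &mock_inet6_stream_ops);
	assert(ops_by_family(&v6) == &mock_inet6_stream_ops);

	/* Model sockmap replacing the proto of the IPv6 socket. */
	v6.prot = &mock_tcp_bpf_prot;

	/* The pointer comparison now picks the IPv4 ops for an IPv6 socket. */
	assert(ops_by_prot(&v6) == &mock_inet_stream_ops);
	/* The family-based check is unaffected by the replacement. */
	assert(ops_by_family(&v6) == &mock_inet6_stream_ops);
}
```

Calling `demo_fallback_selection()` exercises both cases. The model ignores the WARN_ON_ONCE() and the CONFIG_MPTCP_IPV6 guard, which the real function keeps.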
On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> When the server has MPTCP enabled but receives a non-MP-capable request
> from a client, it calls mptcp_fallback_tcp_ops().
>
> Since non-MPTCP connections are allowed to use sockmap, which replaces
> sk->sk_prot, using sk->sk_prot to determine the IP version in
> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> incorrect ops to sk->sk_socket->ops.

I don't see how sockmap could modify the to-be-accepted socket sk_prot
before mptcp_fallback_tcp_ops(), as such call happens before the fd is
installed, and AFAICS sockmap can only fetch sockets via fds.

Is this patch needed?

/P
October 28, 2025 at 19:30, "Paolo Abeni" <pabeni@redhat.com> wrote:
>
> On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> >
> > When the server has MPTCP enabled but receives a non-MP-capable request
> > from a client, it calls mptcp_fallback_tcp_ops().
> >
> > Since non-MPTCP connections are allowed to use sockmap, which replaces
> > sk->sk_prot, using sk->sk_prot to determine the IP version in
> > mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> > incorrect ops to sk->sk_socket->ops.
> >
> I don't see how sockmap could modify the to-be-accepted socket sk_prot
> before mptcp_fallback_tcp_ops(), as such call happens before the fd is
> installed, and AFAICS sockmap can only fetch sockets via fds.
>
> Is this patch needed?

mptcp_fallback_tcp_ops() is only called during the accept process. Before
that, however, the sk_prot of an already established TCP socket can be
replaced via the following path:

tcp_rcv_state_process()
  tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
    call bpf prog
      bpf_sock_map_update(sk)
        tcp_bpf_update_proto()

However, after discussing with Matthieu, we've concluded that this patch is
indeed no longer necessary, as we have a simpler way to intercept the operation.

Thanks~
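
The ordering described above (sk_prot replaced at establishment time, before accept() installs an fd) can be modelled with a small userspace sketch. It only mirrors the sequence of events; every `mock_` name is hypothetical and none of these functions are kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the ordering only; not kernel definitions. */
enum mock_prot { MOCK_TCP_PROT, MOCK_TCP_BPF_PROT };

struct mock_sock {
	enum mock_prot prot;
	bool fd_installed;
};

/* Stands in for a sock_ops BPF program adding the socket to a sockmap. */
static void mock_sockops_passive_established(struct mock_sock *sk)
{
	sk->prot = MOCK_TCP_BPF_PROT; /* models tcp_bpf_update_proto() */
}

/* Stands in for the handshake completing on the passive side. */
static void mock_rcv_state_process(struct mock_sock *sk)
{
	mock_sockops_passive_established(sk);
}

/* Stands in for accept(): returns the prot the fallback path would see. */
static enum mock_prot mock_accept(struct mock_sock *sk)
{
	enum mock_prot seen = sk->prot; /* where the fallback ops get chosen */

	sk->fd_installed = true;        /* the fd only exists from here on */
	return seen;
}

static void demo_ordering(void)
{
	struct mock_sock sk = { MOCK_TCP_PROT, false };

	mock_rcv_state_process(&sk);

	/* The prot is already replaced although no fd has been installed. */
	assert(!sk.fd_installed);
	assert(sk.prot == MOCK_TCP_BPF_PROT);

	/* accept() therefore observes the replaced prot. */
	assert(mock_accept(&sk) == MOCK_TCP_BPF_PROT);
	assert(sk.fd_installed);
}
```

In this model the socket reaches the sockmap through the callback's own socket reference rather than through an fd lookup, which is why the replacement can precede accept().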
On 10/28/25 12:30 PM, Paolo Abeni wrote:
> On 10/23/25 2:54 PM, Jiayuan Chen wrote:
>> When the server has MPTCP enabled but receives a non-MP-capable request
>> from a client, it calls mptcp_fallback_tcp_ops().
>>
>> Since non-MPTCP connections are allowed to use sockmap, which replaces
>> sk->sk_prot, using sk->sk_prot to determine the IP version in
>> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
>> incorrect ops to sk->sk_socket->ops.
>
> I don't see how sockmap could modify the to-be-accepted socket sk_prot
> before mptcp_fallback_tcp_ops(), as such call happens before the fd is
> installed, and AFAICS sockmap can only fetch sockets via fds.
>
> Is this patch needed?

Matttbe explained off-list the details of how that could happen. I think
the commit message here must be more verbose to explain clearly the
whys, even to those non proficient in sockmap like me.

Thanks,

Paolo
October 28, 2025 at 19:47, "Paolo Abeni" <pabeni@redhat.com> wrote:
>
> On 10/28/25 12:30 PM, Paolo Abeni wrote:
> >
> > On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> > >
> > > When the server has MPTCP enabled but receives a non-MP-capable request
> > > from a client, it calls mptcp_fallback_tcp_ops().
> > >
> > > Since non-MPTCP connections are allowed to use sockmap, which replaces
> > > sk->sk_prot, using sk->sk_prot to determine the IP version in
> > > mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> > > incorrect ops to sk->sk_socket->ops.
> > >
> > I don't see how sockmap could modify the to-be-accepted socket sk_prot
> > before mptcp_fallback_tcp_ops(), as such call happens before the fd is
> > installed, and AFAICS sockmap can only fetch sockets via fds.
> >
> > Is this patch needed?
> >
> Matttbe explained off-list the details of how that could happen. I think
> the commit message here must be more verbose to explain clearly the
> whys, even to those non proficient in sockmap like me.
>
> Thanks,
>
> Paolo
>

Thanks, I will add more details into commit message :).
Hi Jiayuan,
On 23/10/2025 14:54, Jiayuan Chen wrote:
> When the server has MPTCP enabled but receives a non-MP-capable request
> from a client, it calls mptcp_fallback_tcp_ops().
>
> Since non-MPTCP connections are allowed to use sockmap, which replaces
> sk->sk_prot, using sk->sk_prot to determine the IP version in
> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> incorrect ops to sk->sk_socket->ops.
>
> Additionally, when BPF Sockmap modifies the protocol handlers, the
> original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
> trigger warnings.
>
> Fix this by using the more stable sk_family to distinguish between IPv4
> and IPv6 connections, ensuring correct fallback protocol operations are
> selected even when BPF Sockmap has modified the socket protocol handlers.
>
> Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> net/mptcp/protocol.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 0292162a14ee..2393741bc310 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
>
> static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
> {
> + /* When BPF sockmap is used, it may replace sk->sk_prot.
> + * Using sk_family is a reliable way to determine the IP version.
> + */
> + unsigned short family = READ_ONCE(sk->sk_family);
> +
> #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> - if (sk->sk_prot == &tcpv6_prot)
> + if (family == AF_INET6)
> return &inet6_stream_ops;
> #endif
> - WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
> + WARN_ON_ONCE(family != AF_INET);
> return &inet_stream_ops;
Just to be sure: is there anything in BPF modifying sk->sk_socket->ops?
Because that's what mptcp_fallback_tcp_ops() will do somehow.
In other words, is it always fine to set inet(6)_stream_ops? (I guess
yes, but better to be sure while we are looking at that :) )
> }
>
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
October 23, 2025 at 22:10, "Matthieu Baerts" <matttbe@kernel.org> wrote:
>
> Hi Jiayuan,
>
> On 23/10/2025 14:54, Jiayuan Chen wrote:
>
> >
> > When the server has MPTCP enabled but receives a non-MP-capable request
> > from a client, it calls mptcp_fallback_tcp_ops().
> >
> > Since non-MPTCP connections are allowed to use sockmap, which replaces
> > sk->sk_prot, using sk->sk_prot to determine the IP version in
> > mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> > incorrect ops to sk->sk_socket->ops.
> >
> > Additionally, when BPF Sockmap modifies the protocol handlers, the
> > original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
> > trigger warnings.
> >
> > Fix this by using the more stable sk_family to distinguish between IPv4
> > and IPv6 connections, ensuring correct fallback protocol operations are
> > selected even when BPF Sockmap has modified the socket protocol handlers.
> >
> > Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
> > ---
> > net/mptcp/protocol.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index 0292162a14ee..2393741bc310 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
> >
> > static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
> > {
> > + /* When BPF sockmap is used, it may replace sk->sk_prot.
> > + * Using sk_family is a reliable way to determine the IP version.
> > + */
> > + unsigned short family = READ_ONCE(sk->sk_family);
> > +
> > #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> > - if (sk->sk_prot == &tcpv6_prot)
> > + if (family == AF_INET6)
> > return &inet6_stream_ops;
> > #endif
> > - WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
> > + WARN_ON_ONCE(family != AF_INET);
> > return &inet_stream_ops;
> >
> Just to be sure: is there anything in BPF modifying sk->sk_socket->ops?
> Because that's what mptcp_fallback_tcp_ops() will do somehow.
>
> In other words, is it always fine to set inet(6)_stream_ops? (I guess
> yes, but better to be sure while we are looking at that :) )
Hi Matt,
I can confirm that on the BPF side, the only special operations targeting
sockets currently are sockmap/sockhash. Their implementations do not modify
sk->sk_socket->ops. Currently, they only modify sk->sk_prot, because the BPF
side typically operates on 'struct sock' and does not concern itself with
'struct socket'.
Therefore, setting inet(6)_stream_ops is fine.
Thanks,
Jiayuan
> >
> > }
> >
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>