If a passive MPTCP socket creates active subflows while still unaccepted,
__mptcp_subflow_connect() will try to graft such subflows to the msk,
but the msk struct socket is not yet initialized at that point:
the subflows will misbehave.
Address the issue always trying to graft the subflow in
mptcp_finish_join(), regardless of the subflow itself being active or
passive. To avoid races with accept(), access the msk->sk_socket under
the callback lock.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
net/mptcp/protocol.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 8965abb94b81..1b3c5fd01600 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -913,12 +913,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
mptcp_subflow_joined(msk, ssk);
spin_unlock_bh(&msk->fallback_lock);
- /* attach to msk socket only after we are sure we will deal with it
- * at close time
- */
- if (sk->sk_socket && !ssk->sk_socket)
- mptcp_sock_graft(ssk, sk->sk_socket);
-
mptcp_subflow_ctx(ssk)->subflow_id = msk->subflow_id++;
mptcp_sockopt_sync_locked(msk, ssk);
mptcp_stop_tout_timer(sk);
@@ -3734,6 +3728,20 @@ void mptcp_sock_graft(struct sock *sk, struct socket *parent)
write_unlock_bh(&sk->sk_callback_lock);
}
+static void mptcp_check_graft(struct sock *sk, struct sock *ssk)
+{
+ struct socket *sock;
+
+ if (ssk->sk_socket)
+ return;
+
+ write_lock_bh(&sk->sk_callback_lock);
+ sock = sk->sk_socket;
+ write_lock_bh(&sk->sk_callback_lock);
+ if (sock)
+ mptcp_sock_graft(ssk, sock);
+}
+
bool mptcp_finish_join(struct sock *ssk)
{
struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
@@ -3758,6 +3766,7 @@ bool mptcp_finish_join(struct sock *ssk)
}
mptcp_subflow_joined(msk, ssk);
spin_unlock_bh(&msk->fallback_lock);
+ mptcp_check_graft(parent, ssk);
mptcp_propagate_sndbuf(parent, ssk);
return true;
}
@@ -3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk)
goto err_prohibited;
}
+ mptcp_check_graft(parent, ssk);
+
/* If we can't acquire msk socket lock here, let the release callback
* handle it
*/
--
2.51.1
Hi Paolo,
On 13/11/2025 01:10, Paolo Abeni wrote:
> If a passive MPTCP socket creates active subflows while still unaccepted,
> __mptcp_subflow_connect() will try to graft such subflows to the msk,
> but the msk struct socket is not yet initialized at that point:
> the subflows will misbehave.
What kind of errors were visible?
> Address the issue always trying to graft the subflow in
> mptcp_finish_join(), regardless of the subflow itself being active or
> passive. To avoid races with accept(), access the msk->sk_socket under
> the callback lock.
Thank you for addressing this issue!
>
By chance, do you have a Fixes tag to add here? (if it is for -net)
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> net/mptcp/protocol.c | 23 +++++++++++++++++------
> 1 file changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 8965abb94b81..1b3c5fd01600 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -913,12 +913,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
> mptcp_subflow_joined(msk, ssk);
> spin_unlock_bh(&msk->fallback_lock);
>
> - /* attach to msk socket only after we are sure we will deal with it
> - * at close time
> - */
> - if (sk->sk_socket && !ssk->sk_socket)
> - mptcp_sock_graft(ssk, sk->sk_socket);
> -
> mptcp_subflow_ctx(ssk)->subflow_id = msk->subflow_id++;
> mptcp_sockopt_sync_locked(msk, ssk);
> mptcp_stop_tout_timer(sk);
> @@ -3734,6 +3728,20 @@ void mptcp_sock_graft(struct sock *sk, struct socket *parent)
> write_unlock_bh(&sk->sk_callback_lock);
> }
>
> +static void mptcp_check_graft(struct sock *sk, struct sock *ssk)
> +{
> + struct socket *sock;
> +
> + if (ssk->sk_socket)
> + return;
> +
> + write_lock_bh(&sk->sk_callback_lock);
> + sock = sk->sk_socket;
> + write_lock_bh(&sk->sk_callback_lock);
The build job failed because here it should be the unlock version
(write_unlock_bh()).
(and probably an empty line just after).
If there is only that, I can fix that when applying the patches.
> + if (sock)
> + mptcp_sock_graft(ssk, sock);
> +}
> +
> bool mptcp_finish_join(struct sock *ssk)
> {
> struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> @@ -3758,6 +3766,7 @@ bool mptcp_finish_join(struct sock *ssk)
> }
> mptcp_subflow_joined(msk, ssk);
> spin_unlock_bh(&msk->fallback_lock);
> + mptcp_check_graft(parent, ssk);
> mptcp_propagate_sndbuf(parent, ssk);
> return true;
> }
> @@ -3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk)
> goto err_prohibited;
> }
>
> + mptcp_check_graft(parent, ssk);
Is it OK to graft it even in case of errors in __mptcp_finish_join()?
i.e. not in an established state or !msk->allow_subflows.
Should it not be done after the block below, if there was no error?
> +
> /* If we can't acquire msk socket lock here, let the release callback
> * handle it
> */
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
On 11/13/25 9:47 AM, Matthieu Baerts wrote:
> Hi Paolo,
>
> On 13/11/2025 01:10, Paolo Abeni wrote:
>> If a passive MPTCP socket creates active subflows while still unaccepted,
>> __mptcp_subflow_connect() will try to graft such subflows to the msk,
>> but the msk struct socket is not yet initialized at that point:
>> the subflows will misbehave.
>
> What kind of errors were visible?
Found by code inspection, never replicated in practice; thinking again
about it. I'm no more sure this is actually needed: the subflow
mentioned above should be catched and fixed at accept time.
/me needs more coffee and thinking...
>> +static void mptcp_check_graft(struct sock *sk, struct sock *ssk)
>> +{
>> + struct socket *sock;
>> +
>> + if (ssk->sk_socket)
>> + return;
>> +
>> + write_lock_bh(&sk->sk_callback_lock);
>> + sock = sk->sk_socket;
>> + write_lock_bh(&sk->sk_callback_lock);
>
> The build job failed because here it should be the unlock version
> (write_unlock_bh()).
> (and probably an empty line just after).
Oh, it looks like I shared an older revision, sorry.
> If there is only that, I can fix that when applying the patches.
>
>> + if (sock)
>> + mptcp_sock_graft(ssk, sock);
>> +}
>> +
>> bool mptcp_finish_join(struct sock *ssk)
>> {
>> struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
>> @@ -3758,6 +3766,7 @@ bool mptcp_finish_join(struct sock *ssk)
>> }
>> mptcp_subflow_joined(msk, ssk);
>> spin_unlock_bh(&msk->fallback_lock);
>> + mptcp_check_graft(parent, ssk);
>> mptcp_propagate_sndbuf(parent, ssk);
>> return true;
>> }
>> @@ -3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk)
>> goto err_prohibited;
>> }
>>
>> + mptcp_check_graft(parent, ssk);
>
> Is it OK to graft it even in case of errors in __mptcp_finish_join()?
> i.e. not in an established state or !msk->allow_subflows.
>
> Should it not be done after the block below, if there was no error?
My understanding/hope is that grafting early don't cause issues:
- no UaF access to ssk->sk_socket, because ssk either do finish
successfully the join or is reset/delete very soon.
- no ref leak, because ssk does not acquire any actual reference on
msk/socket.
It's needed to do it in mptcp_finish_join(), to ensure that in patch 3/3
all the subflows will always push accounted data after that
mptcp_graft_subflows() completes - or more specifically the
`backlog_unaccounted` counter is flushed.
Let me see if I can rework 3/3 without this patch
/P
Hi Paolo, On 13/11/2025 18:09, Paolo Abeni wrote: > On 11/13/25 9:47 AM, Matthieu Baerts wrote: >> On 13/11/2025 01:10, Paolo Abeni wrote: (...) >>> @@ -3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk) >>> goto err_prohibited; >>> } >>> >>> + mptcp_check_graft(parent, ssk); >> >> Is it OK to graft it even in case of errors in __mptcp_finish_join()? >> i.e. not in an established state or !msk->allow_subflows. >> >> Should it not be done after the block below, if there was no error? > > My understanding/hope is that grafting early don't cause issues: > - no UaF access to ssk->sk_socket, because ssk either do finish > successfully the join or is reset/delete very soon. > - no ref leak, because ssk does not acquire any actual reference on > msk/socket. > > It's needed to do it in mptcp_finish_join(), to ensure that in patch 3/3 > all the subflows will always push accounted data after that > mptcp_graft_subflows() completes - or more specifically the > `backlog_unaccounted` counter is flushed. > > Let me see if I can rework 3/3 without this patch We don't need to drop this patch if it fixes thing. I was only asking some questions to understand the behavioural change, and if that was not going to introduce different issues :) Cheers, Matt -- Sponsored by the NGI0 Core fund.
© 2016 - 2025 Red Hat, Inc.