[v3] mptcp: cg and backlog follow-up

[PATCH v3 mptcp-net 1/3] mptcp: fix grafting corner case

Posted by Paolo Abeni 2 weeks ago

If a passive MPTCP socket creates active subflows while still unaccepted,
__mptcp_subflow_connect() will try to graft such subflows to the msk,
but the msk struct socket is not yet initialized at that point:
the subflows will misbehave.

Address the issue always trying to graft the subflow in
mptcp_finish_join(), regardless of the subflow itself being active or
passive. To avoid races with accept(), access the msk->sk_socket under
the callback lock.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 8965abb94b81..1b3c5fd01600 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -913,12 +913,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
 	mptcp_subflow_joined(msk, ssk);
 	spin_unlock_bh(&msk->fallback_lock);
 
-	/* attach to msk socket only after we are sure we will deal with it
-	 * at close time
-	 */
-	if (sk->sk_socket && !ssk->sk_socket)
-		mptcp_sock_graft(ssk, sk->sk_socket);
-
 	mptcp_subflow_ctx(ssk)->subflow_id = msk->subflow_id++;
 	mptcp_sockopt_sync_locked(msk, ssk);
 	mptcp_stop_tout_timer(sk);
@@ -3734,6 +3728,20 @@ void mptcp_sock_graft(struct sock *sk, struct socket *parent)
 	write_unlock_bh(&sk->sk_callback_lock);
 }
 
+static void mptcp_check_graft(struct sock *sk, struct sock *ssk)
+{
+	struct socket *sock;
+
+	if (ssk->sk_socket)
+		return;
+
+	write_lock_bh(&sk->sk_callback_lock);
+	sock = sk->sk_socket;
+	write_lock_bh(&sk->sk_callback_lock);
+	if (sock)
+		mptcp_sock_graft(ssk, sock);
+}
+
 bool mptcp_finish_join(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
@@ -3758,6 +3766,7 @@ bool mptcp_finish_join(struct sock *ssk)
 		}
 		mptcp_subflow_joined(msk, ssk);
 		spin_unlock_bh(&msk->fallback_lock);
+		mptcp_check_graft(parent, ssk);
 		mptcp_propagate_sndbuf(parent, ssk);
 		return true;
 	}
@@ -3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk)
 		goto err_prohibited;
 	}
 
+	mptcp_check_graft(parent, ssk);
+
 	/* If we can't acquire msk socket lock here, let the release callback
 	 * handle it
 	 */
-- 
2.51.1

Re: [PATCH v3 mptcp-net 1/3] mptcp: fix grafting corner case

Posted by Matthieu Baerts 2 weeks ago

Hi Paolo,

On 13/11/2025 01:10, Paolo Abeni wrote:
> If a passive MPTCP socket creates active subflows while still unaccepted,
> __mptcp_subflow_connect() will try to graft such subflows to the msk,
> but the msk struct socket is not yet initialized at that point:
> the subflows will misbehave.

What kind of errors were visible?

> Address the issue always trying to graft the subflow in
> mptcp_finish_join(), regardless of the subflow itself being active or
> passive. To avoid races with accept(), access the msk->sk_socket under
> the callback lock.

Thank you for addressing this issue!

> 

By chance, do you have a Fixes tag to add here? (if it is for -net)

> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/mptcp/protocol.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 8965abb94b81..1b3c5fd01600 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -913,12 +913,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
>  	mptcp_subflow_joined(msk, ssk);
>  	spin_unlock_bh(&msk->fallback_lock);
>  
> -	/* attach to msk socket only after we are sure we will deal with it
> -	 * at close time
> -	 */
> -	if (sk->sk_socket && !ssk->sk_socket)
> -		mptcp_sock_graft(ssk, sk->sk_socket);
> -
>  	mptcp_subflow_ctx(ssk)->subflow_id = msk->subflow_id++;
>  	mptcp_sockopt_sync_locked(msk, ssk);
>  	mptcp_stop_tout_timer(sk);
> @@ -3734,6 +3728,20 @@ void mptcp_sock_graft(struct sock *sk, struct socket *parent)
>  	write_unlock_bh(&sk->sk_callback_lock);
>  }
>  
> +static void mptcp_check_graft(struct sock *sk, struct sock *ssk)
> +{
> +	struct socket *sock;
> +
> +	if (ssk->sk_socket)
> +		return;
> +
> +	write_lock_bh(&sk->sk_callback_lock);
> +	sock = sk->sk_socket;
> +	write_lock_bh(&sk->sk_callback_lock);

The build job failed because here it should be the unlock version
(write_unlock_bh()).
(and probably an empty line just after).

If there is only that, I can fix that when applying the patches.

> +	if (sock)
> +		mptcp_sock_graft(ssk, sock);
> +}
> +
>  bool mptcp_finish_join(struct sock *ssk)
>  {
>  	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> @@ -3758,6 +3766,7 @@ bool mptcp_finish_join(struct sock *ssk)
>  		}
>  		mptcp_subflow_joined(msk, ssk);
>  		spin_unlock_bh(&msk->fallback_lock);
> +		mptcp_check_graft(parent, ssk);
>  		mptcp_propagate_sndbuf(parent, ssk);
>  		return true;
>  	}
> @@ -3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk)
>  		goto err_prohibited;
>  	}
>  
> +	mptcp_check_graft(parent, ssk);

Is it OK to graft it even in case of errors in __mptcp_finish_join()?
i.e. not in an established state or !msk->allow_subflows.

Should it not be done after the block below, if there was no error?

> +
>  	/* If we can't acquire msk socket lock here, let the release callback
>  	 * handle it
>  	 */

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH v3 mptcp-net 1/3] mptcp: fix grafting corner case

Posted by Paolo Abeni 1 week, 6 days ago

On 11/13/25 9:47 AM, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 13/11/2025 01:10, Paolo Abeni wrote:
>> If a passive MPTCP socket creates active subflows while still unaccepted,
>> __mptcp_subflow_connect() will try to graft such subflows to the msk,
>> but the msk struct socket is not yet initialized at that point:
>> the subflows will misbehave.
> 
> What kind of errors were visible?

Found by code inspection, never replicated in practice; thinking again
about it. I'm no more sure this is actually needed: the subflow
mentioned above should be catched and fixed at accept time.

/me needs more coffee and thinking...
>> +static void mptcp_check_graft(struct sock *sk, struct sock *ssk)
>> +{
>> +	struct socket *sock;
>> +
>> +	if (ssk->sk_socket)
>> +		return;
>> +
>> +	write_lock_bh(&sk->sk_callback_lock);
>> +	sock = sk->sk_socket;
>> +	write_lock_bh(&sk->sk_callback_lock);
> 
> The build job failed because here it should be the unlock version
> (write_unlock_bh()).
> (and probably an empty line just after).
Oh, it looks like I shared an older revision, sorry.

> If there is only that, I can fix that when applying the patches.
> 
>> +	if (sock)
>> +		mptcp_sock_graft(ssk, sock);
>> +}
>> +
>>  bool mptcp_finish_join(struct sock *ssk)
>>  {
>>  	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
>> @@ -3758,6 +3766,7 @@ bool mptcp_finish_join(struct sock *ssk)
>>  		}
>>  		mptcp_subflow_joined(msk, ssk);
>>  		spin_unlock_bh(&msk->fallback_lock);
>> +		mptcp_check_graft(parent, ssk);
>>  		mptcp_propagate_sndbuf(parent, ssk);
>>  		return true;
>>  	}
>> @@ -3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk)
>>  		goto err_prohibited;
>>  	}
>>  
>> +	mptcp_check_graft(parent, ssk);
> 
> Is it OK to graft it even in case of errors in __mptcp_finish_join()?
> i.e. not in an established state or !msk->allow_subflows.
> 
> Should it not be done after the block below, if there was no error?

My understanding/hope is that grafting early don't cause issues:
- no UaF access to ssk->sk_socket, because ssk either do finish
successfully the join or is reset/delete very soon.
- no ref leak, because ssk does not acquire any actual reference on
msk/socket.

It's needed to do it in mptcp_finish_join(), to ensure that in patch 3/3
all the subflows will always push accounted data after that
mptcp_graft_subflows() completes - or more specifically the
`backlog_unaccounted` counter is flushed.

Let me see if I can rework 3/3 without this patch

/P

Re: [PATCH v3 mptcp-net 1/3] mptcp: fix grafting corner case

Posted by Matthieu Baerts 1 week, 6 days ago

Hi Paolo,

On 13/11/2025 18:09, Paolo Abeni wrote:
> On 11/13/25 9:47 AM, Matthieu Baerts wrote:
>> On 13/11/2025 01:10, Paolo Abeni wrote:

(...)

>>> @@ -3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk)
>>>  		goto err_prohibited;
>>>  	}
>>>  
>>> +	mptcp_check_graft(parent, ssk);
>>
>> Is it OK to graft it even in case of errors in __mptcp_finish_join()?
>> i.e. not in an established state or !msk->allow_subflows.
>>
>> Should it not be done after the block below, if there was no error?
> 
> My understanding/hope is that grafting early don't cause issues:
> - no UaF access to ssk->sk_socket, because ssk either do finish
> successfully the join or is reset/delete very soon.
> - no ref leak, because ssk does not acquire any actual reference on
> msk/socket.
> 
> It's needed to do it in mptcp_finish_join(), to ensure that in patch 3/3
> all the subflows will always push accounted data after that
> mptcp_graft_subflows() completes - or more specifically the
> `backlog_unaccounted` counter is flushed.
> 
> Let me see if I can rework 3/3 without this patch

We don't need to drop this patch if it fixes thing. I was only asking
some questions to understand the behavioural change, and if that was not
going to introduce different issues :)

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.