[PATCH v2 mptcp-net 2/2] mptcp: fix duplicate reset on fastclose

Paolo Abeni posted 2 patches 2 weeks, 6 days ago
[PATCH v2 mptcp-net 2/2] mptcp: fix duplicate reset on fastclose
Posted by Paolo Abeni 2 weeks, 6 days ago
The CI reports sporadic failures of the fastclose self-tests. The root
cause is a duplicate reset, not carrying the relevant MPTCP option.
In the failing scenario the bad reset is received by the peer before
the fastclose one, preventing the reception of the latter.

Indeed there is window of opportunity at fastclose time for the following
race:

mptcp_do_fastclose
  __mptcp_close_ssk
    __tcp_close()
      tcp_set_state() [1]
      tcp_send_active_reset() [2]

After [1] the stack will send reset to in-flight data reaching the now
closed port. Such reset may race with [2].

Address the issue explicitly sending a single reset on fastclose before
explicitly moving the subflow to close status.

Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios");
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/596
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v1 -> v2:
 - test subflow->send_fastclose in __mptcp_subflow_disconnect() instead
   of MPTCP_CF_FASTCLOSE
---
 net/mptcp/protocol.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0301e0b0de05..24d4fa8227b7 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2437,7 +2437,6 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
 
 /* flags for __mptcp_close_ssk() */
 #define MPTCP_CF_PUSH		BIT(1)
-#define MPTCP_CF_FASTCLOSE	BIT(2)
 
 /* be sure to send a reset only if the caller asked for it, also
  * clean completely the subflow status when the subflow reaches
@@ -2448,7 +2447,7 @@ static void __mptcp_subflow_disconnect(struct sock *ssk,
 				       unsigned int flags)
 {
 	if (((1 << ssk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
-	    (flags & MPTCP_CF_FASTCLOSE)) {
+	    subflow->send_fastclose) {
 		/* The MPTCP code never wait on the subflow sockets, TCP-level
 		 * disconnect should never fail
 		 */
@@ -2511,20 +2510,13 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 	if (dispose_it)
 		list_del(&subflow->node);
 
-	if ((flags & MPTCP_CF_FASTCLOSE) && !__mptcp_check_fallback(msk)) {
-		/* be sure to force the tcp_close path
-		 * to generate the egress reset
-		 */
-		ssk->sk_lingertime = 0;
-		sock_set_flag(ssk, SOCK_LINGER);
-		subflow->send_fastclose = 1;
-	}
+	if (subflow->send_fastclose && ssk->sk_state != TCP_CLOSE)
+		tcp_set_state(ssk, TCP_CLOSE);
 
 	need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk);
 	if (!dispose_it) {
 		__mptcp_subflow_disconnect(ssk, subflow, flags);
 		release_sock(ssk);
-
 		goto out;
 	}
 
@@ -2855,9 +2847,26 @@ static void mptcp_do_fastclose(struct sock *sk)
 
 	mptcp_set_state(sk, TCP_CLOSE);
 	mptcp_backlog_purge(sk);
-	mptcp_for_each_subflow_safe(msk, subflow, tmp)
-		__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow),
-				  subflow, MPTCP_CF_FASTCLOSE);
+
+	/* Explicitly send the fastclose reset as need */
+	if (__mptcp_check_fallback(msk))
+		return;
+
+	mptcp_for_each_subflow_safe(msk, subflow, tmp) {
+		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+		lock_sock(ssk);
+
+		/* Some subflow socket states don't allow/need a reset.*/
+		if ((1 << ssk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
+			goto unlock;
+
+		subflow->send_fastclose = 1;
+		tcp_send_active_reset(ssk, ssk->sk_allocation,
+				      SK_RST_REASON_TCP_ABORT_ON_CLOSE);
+unlock:
+		release_sock(ssk);
+	}
 }
 
 static void mptcp_worker(struct work_struct *work)
-- 
2.51.0
Re: [PATCH v2 mptcp-net 2/2] mptcp: fix duplicate reset on fastclose
Posted by Matthieu Baerts 2 weeks, 2 days ago
Hi Paolo,

On 07/11/2025 08:23, Paolo Abeni wrote:
> The CI reports sporadic failures of the fastclose self-tests. The root
> cause is a duplicate reset, not carrying the relevant MPTCP option.
> In the failing scenario the bad reset is received by the peer before
> the fastclose one, preventing the reception of the latter.
> 
> Indeed there is window of opportunity at fastclose time for the following
> race:
> 
> mptcp_do_fastclose
>   __mptcp_close_ssk
>     __tcp_close()
>       tcp_set_state() [1]
>       tcp_send_active_reset() [2]
> 
> After [1] the stack will send reset to in-flight data reaching the now
> closed port. Such reset may race with [2].
> 
> Address the issue explicitly sending a single reset on fastclose before
> explicitly moving the subflow to close status.

Thank you for the proper fix!

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios");

When applying this patch, I will remove the extra ';' at the end of this
line, plus not removing a line below.

I will also revert "selftests: mptcp: join: fastclose: drop plain RST"
and only keep the removal of the two 'MPTCP_LIB_SUBTEST_FLAKY=1' lines.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH v2 mptcp-net 2/2] mptcp: fix duplicate reset on fastclose
Posted by Geliang Tang 2 weeks, 2 days ago
Hi Paolo,

Thanks for this v2.

On Fri, 2025-11-07 at 08:23 +0100, Paolo Abeni wrote:
> The CI reports sporadic failures of the fastclose self-tests. The
> root
> cause is a duplicate reset, not carrying the relevant MPTCP option.
> In the failing scenario the bad reset is received by the peer before
> the fastclose one, preventing the reception of the latter.
> 
> Indeed there is window of opportunity at fastclose time for the
> following
> race:
> 
> mptcp_do_fastclose
>   __mptcp_close_ssk
>     __tcp_close()
>       tcp_set_state() [1]
>       tcp_send_active_reset() [2]
> 
> After [1] the stack will send reset to in-flight data reaching the
> now
> closed port. Such reset may race with [2].
> 
> Address the issue explicitly sending a single reset on fastclose
> before
> explicitly moving the subflow to close status.
> 
> Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios");
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/596
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> v1 -> v2:
>  - test subflow->send_fastclose in __mptcp_subflow_disconnect()
> instead
>    of MPTCP_CF_FASTCLOSE
> ---
>  net/mptcp/protocol.c | 37 +++++++++++++++++++++++--------------
>  1 file changed, 23 insertions(+), 14 deletions(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 0301e0b0de05..24d4fa8227b7 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2437,7 +2437,6 @@ bool __mptcp_retransmit_pending_data(struct
> sock *sk)
>  
>  /* flags for __mptcp_close_ssk() */
>  #define MPTCP_CF_PUSH		BIT(1)
> -#define MPTCP_CF_FASTCLOSE	BIT(2)
>  
>  /* be sure to send a reset only if the caller asked for it, also
>   * clean completely the subflow status when the subflow reaches
> @@ -2448,7 +2447,7 @@ static void __mptcp_subflow_disconnect(struct
> sock *ssk,
>  				       unsigned int flags)

nit:

The 3rd argument "flags" of __mptcp_subflow_disconnect is useless now.
We can drop it.

No need to send v3, Matt or I can handle it.

Thanks,
-Geliang

>  {
>  	if (((1 << ssk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
> -	    (flags & MPTCP_CF_FASTCLOSE)) {
> +	    subflow->send_fastclose) {
>  		/* The MPTCP code never wait on the subflow sockets,
> TCP-level
>  		 * disconnect should never fail
>  		 */
> @@ -2511,20 +2510,13 @@ static void __mptcp_close_ssk(struct sock
> *sk, struct sock *ssk,
>  	if (dispose_it)
>  		list_del(&subflow->node);
>  
> -	if ((flags & MPTCP_CF_FASTCLOSE) &&
> !__mptcp_check_fallback(msk)) {
> -		/* be sure to force the tcp_close path
> -		 * to generate the egress reset
> -		 */
> -		ssk->sk_lingertime = 0;
> -		sock_set_flag(ssk, SOCK_LINGER);
> -		subflow->send_fastclose = 1;
> -	}
> +	if (subflow->send_fastclose && ssk->sk_state != TCP_CLOSE)
> +		tcp_set_state(ssk, TCP_CLOSE);
>  
>  	need_push = (flags & MPTCP_CF_PUSH) &&
> __mptcp_retransmit_pending_data(sk);
>  	if (!dispose_it) {
>  		__mptcp_subflow_disconnect(ssk, subflow, flags);
>  		release_sock(ssk);
> -
>  		goto out;
>  	}
>  
> @@ -2855,9 +2847,26 @@ static void mptcp_do_fastclose(struct sock
> *sk)
>  
>  	mptcp_set_state(sk, TCP_CLOSE);
>  	mptcp_backlog_purge(sk);
> -	mptcp_for_each_subflow_safe(msk, subflow, tmp)
> -		__mptcp_close_ssk(sk,
> mptcp_subflow_tcp_sock(subflow),
> -				  subflow, MPTCP_CF_FASTCLOSE);
> +
> +	/* Explicitly send the fastclose reset as need */
> +	if (__mptcp_check_fallback(msk))
> +		return;
> +
> +	mptcp_for_each_subflow_safe(msk, subflow, tmp) {
> +		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> +
> +		lock_sock(ssk);
> +
> +		/* Some subflow socket states don't allow/need a
> reset.*/
> +		if ((1 << ssk->sk_state) & (TCPF_LISTEN |
> TCPF_CLOSE))
> +			goto unlock;
> +
> +		subflow->send_fastclose = 1;
> +		tcp_send_active_reset(ssk, ssk->sk_allocation,
> +				     
> SK_RST_REASON_TCP_ABORT_ON_CLOSE);
> +unlock:
> +		release_sock(ssk);
> +	}
>  }
>  
>  static void mptcp_worker(struct work_struct *work)

Re: [PATCH v2 mptcp-net 2/2] mptcp: fix duplicate reset on fastclose
Posted by Matthieu Baerts 2 weeks, 2 days ago
Hi Geliang,

11 Nov 2025 02:59:28 Geliang Tang <geliang@kernel.org>:

> Hi Paolo,
>
> Thanks for this v2.
>
> On Fri, 2025-11-07 at 08:23 +0100, Paolo Abeni wrote:
>> The CI reports sporadic failures of the fastclose self-tests. The
>> root
>> cause is a duplicate reset, not carrying the relevant MPTCP option.
>> In the failing scenario the bad reset is received by the peer before
>> the fastclose one, preventing the reception of the latter.
>>
>> Indeed there is window of opportunity at fastclose time for the
>> following
>> race:
>>
>> mptcp_do_fastclose
>>   __mptcp_close_ssk
>>     __tcp_close()
>>       tcp_set_state() [1]
>>       tcp_send_active_reset() [2]
>>
>> After [1] the stack will send reset to in-flight data reaching the
>> now
>> closed port. Such reset may race with [2].
>>
>> Address the issue explicitly sending a single reset on fastclose
>> before
>> explicitly moving the subflow to close status.
>>
>> Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios");
>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/596
>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>> ---
>> v1 -> v2:
>>  - test subflow->send_fastclose in __mptcp_subflow_disconnect()
>> instead
>>    of MPTCP_CF_FASTCLOSE
>> ---
>>  net/mptcp/protocol.c | 37 +++++++++++++++++++++++--------------
>>  1 file changed, 23 insertions(+), 14 deletions(-)
>>
>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>> index 0301e0b0de05..24d4fa8227b7 100644
>> --- a/net/mptcp/protocol.c
>> +++ b/net/mptcp/protocol.c
>> @@ -2437,7 +2437,6 @@ bool __mptcp_retransmit_pending_data(struct
>> sock *sk)
>>  
>>  /* flags for __mptcp_close_ssk() */
>>  #define MPTCP_CF_PUSH      BIT(1)
>> -#define MPTCP_CF_FASTCLOSE BIT(2)
>>  
>>  /* be sure to send a reset only if the caller asked for it, also
>>   * clean completely the subflow status when the subflow reaches
>> @@ -2448,7 +2447,7 @@ static void __mptcp_subflow_disconnect(struct
>> sock *ssk,
>>                        unsigned int flags)
>
> nit:
>
> The 3rd argument "flags" of __mptcp_subflow_disconnect is useless now.
> We can drop it.

I don't think it is useless: __mptcp_close_ssk() is still called with either
the push flag or no flag (0).

Because this patch is for net, we should avoid unnecessary modifications:
we should not change the type or the name of the function argument for
"cosmetic" reasons.

>
> No need to send v3, Matt or I can handle it.
>
> Thanks,
> -Geliang
>
>>  {
>>     if (((1 << ssk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
>> -       (flags & MPTCP_CF_FASTCLOSE)) {
>> +       subflow->send_fastclose) {
>>         /* The MPTCP code never wait on the subflow sockets,
>> TCP-level
>>          * disconnect should never fail
>>          */
>> @@ -2511,20 +2510,13 @@ static void __mptcp_close_ssk(struct sock
>> *sk, struct sock *ssk,
>>     if (dispose_it)
>>         list_del(&subflow->node);
>>  
>> -   if ((flags & MPTCP_CF_FASTCLOSE) &&
>> !__mptcp_check_fallback(msk)) {
>> -       /* be sure to force the tcp_close path
>> -        * to generate the egress reset
>> -        */
>> -       ssk->sk_lingertime = 0;
>> -       sock_set_flag(ssk, SOCK_LINGER);
>> -       subflow->send_fastclose = 1;
>> -   }
>> +   if (subflow->send_fastclose && ssk->sk_state != TCP_CLOSE)
>> +       tcp_set_state(ssk, TCP_CLOSE);
>>  
>>     need_push = (flags & MPTCP_CF_PUSH) &&
>> __mptcp_retransmit_pending_data(sk);
>>     if (!dispose_it) {
>>         __mptcp_subflow_disconnect(ssk, subflow, flags);
>>         release_sock(ssk);
>> -

(This line should not be removed but this can be fixed when applying the patch.)

Cheers,
Matt
Re: [PATCH v2 mptcp-net 2/2] mptcp: fix duplicate reset on fastclose
Posted by Paolo Abeni 2 weeks, 2 days ago
On 11/11/25 7:14 AM, Matthieu Baerts wrote:
> 11 Nov 2025 02:59:28 Geliang Tang <geliang@kernel.org>:
>> On Fri, 2025-11-07 at 08:23 +0100, Paolo Abeni wrote:
>>> @@ -2448,7 +2447,7 @@ static void __mptcp_subflow_disconnect(struct
>>> sock *ssk,
>>>                        unsigned int flags)
>>
>> nit:
>>
>> The 3rd argument "flags" of __mptcp_subflow_disconnect is useless now.
>> We can drop it.
> 
> I don't think it is useless: __mptcp_close_ssk() is still called with either
> the push flag or no flag (0).
> 
> Because this patch is for net, we should avoid unnecessary modifications:
> we should not change the type or the name of the function argument for
> "cosmetic" reasons.

FWIW, I agree with Mat.

>>> @@ -2511,20 +2510,13 @@ static void __mptcp_close_ssk(struct sock
>>> *sk, struct sock *ssk,
>>>     if (dispose_it)
>>>         list_del(&subflow->node);
>>>  
>>> -   if ((flags & MPTCP_CF_FASTCLOSE) &&
>>> !__mptcp_check_fallback(msk)) {
>>> -       /* be sure to force the tcp_close path
>>> -        * to generate the egress reset
>>> -        */
>>> -       ssk->sk_lingertime = 0;
>>> -       sock_set_flag(ssk, SOCK_LINGER);
>>> -       subflow->send_fastclose = 1;
>>> -   }
>>> +   if (subflow->send_fastclose && ssk->sk_state != TCP_CLOSE)
>>> +       tcp_set_state(ssk, TCP_CLOSE);
>>>  
>>>     need_push = (flags & MPTCP_CF_PUSH) &&
>>> __mptcp_retransmit_pending_data(sk);
>>>     if (!dispose_it) {
>>>         __mptcp_subflow_disconnect(ssk, subflow, flags);
>>>         release_sock(ssk);
>>> -
> 
> (This line should not be removed but this can be fixed when applying the patch.)

Thanks!

/P

Re: [PATCH v2 mptcp-net 2/2] mptcp: fix duplicate reset on fastclose
Posted by Geliang Tang 2 weeks, 2 days ago
On Tue, 2025-11-11 at 08:27 +0100, Paolo Abeni wrote:
> On 11/11/25 7:14 AM, Matthieu Baerts wrote:
> > 11 Nov 2025 02:59:28 Geliang Tang <geliang@kernel.org>:
> > > On Fri, 2025-11-07 at 08:23 +0100, Paolo Abeni wrote:
> > > > @@ -2448,7 +2447,7 @@ static void
> > > > __mptcp_subflow_disconnect(struct
> > > > sock *ssk,
> > > >                        unsigned int flags)
> > > 
> > > nit:
> > > 
> > > The 3rd argument "flags" of __mptcp_subflow_disconnect is useless
> > > now.
> > > We can drop it.
> > 
> > I don't think it is useless: __mptcp_close_ssk() is still called
> > with either
> > the push flag or no flag (0).
> > 
> > Because this patch is for net, we should avoid unnecessary
> > modifications:
> > we should not change the type or the name of the function argument
> > for
> > "cosmetic" reasons.
> 
> FWIW, I agree with Mat.

I agree too. Let's apply it as is.

Thanks,
-Geliang

> 
> > > > @@ -2511,20 +2510,13 @@ static void __mptcp_close_ssk(struct
> > > > sock
> > > > *sk, struct sock *ssk,
> > > >     if (dispose_it)
> > > >         list_del(&subflow->node);
> > > >  
> > > > -   if ((flags & MPTCP_CF_FASTCLOSE) &&
> > > > !__mptcp_check_fallback(msk)) {
> > > > -       /* be sure to force the tcp_close path
> > > > -        * to generate the egress reset
> > > > -        */
> > > > -       ssk->sk_lingertime = 0;
> > > > -       sock_set_flag(ssk, SOCK_LINGER);
> > > > -       subflow->send_fastclose = 1;
> > > > -   }
> > > > +   if (subflow->send_fastclose && ssk->sk_state != TCP_CLOSE)
> > > > +       tcp_set_state(ssk, TCP_CLOSE);
> > > >  
> > > >     need_push = (flags & MPTCP_CF_PUSH) &&
> > > > __mptcp_retransmit_pending_data(sk);
> > > >     if (!dispose_it) {
> > > >         __mptcp_subflow_disconnect(ssk, subflow, flags);
> > > >         release_sock(ssk);
> > > > -
> > 
> > (This line should not be removed but this can be fixed when
> > applying the patch.)
> 
> Thanks!
> 
> /P
>