[PATCH mptcp-net] mptcp: fallback earlier on simult connection

Paolo Abeni posted 1 patch 2 weeks, 1 day ago
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/36b926c5d55933a6f2d2a7f6ff8d8a091c0de719.1764332508.git.pabeni@redhat.com
Posted by Paolo Abeni 2 weeks, 1 day ago
Syzkaller reports a simult-connect race leading to inconsistent fallback
status:

WARNING: CPU: 3 PID: 33 at net/mptcp/subflow.c:1515 subflow_data_ready+0x40b/0x7c0 net/mptcp/subflow.c:1515
Modules linked in:
CPU: 3 UID: 0 PID: 33 Comm: ksoftirqd/3 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:subflow_data_ready+0x40b/0x7c0 net/mptcp/subflow.c:1515
Code: 89 ee e8 78 61 3c f6 40 84 ed 75 21 e8 8e 66 3c f6 44 89 fe bf 07 00 00 00 e8 c1 61 3c f6 41 83 ff 07 74 09 e8 76 66 3c f6 90 <0f> 0b 90 e8 6d 66 3c f6 48 89 df e8 e5 ad ff ff 31 ff 89 c5 89 c6
RSP: 0018:ffffc900006cf338 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888031acd100 RCX: ffffffff8b7f2abf
RDX: ffff88801e6ea440 RSI: ffffffff8b7f2aca RDI: 0000000000000005
RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000007
R10: 0000000000000004 R11: 0000000000002c10 R12: ffff88802ba69900
R13: 1ffff920000d9e67 R14: ffff888046f81800 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff8880d69bc000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560fc0ca1670 CR3: 0000000032c3a000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 tcp_data_queue+0x13b0/0x4f90 net/ipv4/tcp_input.c:5197
 tcp_rcv_state_process+0xfdf/0x4ec0 net/ipv4/tcp_input.c:6922
 tcp_v6_do_rcv+0x492/0x1740 net/ipv6/tcp_ipv6.c:1672
 tcp_v6_rcv+0x2976/0x41e0 net/ipv6/tcp_ipv6.c:1918
 ip6_protocol_deliver_rcu+0x188/0x1520 net/ipv6/ip6_input.c:438
 ip6_input_finish+0x1e4/0x4b0 net/ipv6/ip6_input.c:489
 NF_HOOK include/linux/netfilter.h:318 [inline]
 NF_HOOK include/linux/netfilter.h:312 [inline]
 ip6_input+0x105/0x2f0 net/ipv6/ip6_input.c:500
 dst_input include/net/dst.h:471 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline]
 NF_HOOK include/linux/netfilter.h:318 [inline]
 NF_HOOK include/linux/netfilter.h:312 [inline]
 ipv6_rcv+0x264/0x650 net/ipv6/ip6_input.c:311
 __netif_receive_skb_one_core+0x12d/0x1e0 net/core/dev.c:5979
 __netif_receive_skb+0x1d/0x160 net/core/dev.c:6092
 process_backlog+0x442/0x15e0 net/core/dev.c:6444
 __napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7494
 napi_poll net/core/dev.c:7557 [inline]
 net_rx_action+0xa9f/0xfe0 net/core/dev.c:7684
 handle_softirqs+0x216/0x8e0 kernel/softirq.c:579
 run_ksoftirqd kernel/softirq.c:968 [inline]
 run_ksoftirqd+0x3a/0x60 kernel/softirq.c:960
 smpboot_thread_fn+0x3f7/0xae0 kernel/smpboot.c:160
 kthread+0x3c2/0x780 kernel/kthread.c:463
 ret_from_fork+0x5d7/0x6f0 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>

The TCP subflow can process the simult-connect syn-ack packet after
transitioning to TCP_FIN_WAIT1 state, bypassing the MPTCP fallback check,
as the sk_state_change() callback is not invoked for * -> FIN_WAIT1
transitions.

That will move the msk socket to an inconsistent status and the next
incoming data will hit the reported splat.

Close the race by moving the simult-fallback check to the earliest
possible stage - that is, at syn-ack generation time.

Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().")
Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
Reported-by: syzbot+0ff6b771b4f7a5bce83b@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/586
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
@mattbe: could you please test vs your syz repro?

The simult connect packetdrill test will need a paired change; not
reporting the full diff to avoid confusing the patch importer:

 +0  >  S  0:0(0)                    <mss 1460, sackOK, TS val 100 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
 +0  <  S  0:0(0)         win 1000   <mss 1460, sackOK, TS val 407 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
-+0  >  S. 0:0(0)  ack 1             <mss 1460, sackOK, TS val 330 ecr 407, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
-+0  <  S. 0:0(0)  ack 1  win 65535  <mss 1460, sackOK, TS val 507 ecr 100, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey=2]>
++0  >  S. 0:0(0)  ack 1             <mss 1460, sackOK, TS val 330 ecr 407, nop,wscale 8>
++0  <  S. 0:0(0)  ack 1  win 65535  <mss 1460, sackOK, TS val 507 ecr 100, nop,wscale 8>
 +0  >   . 1:1(0)  ack 1             <nop,      nop,    TS val 430 ecr 507, nop, nop, sack 0:1>
---
 net/mptcp/options.c  | 10 ++++++++++
 net/mptcp/protocol.h | 11 ++---------
 net/mptcp/subflow.c  |  6 ------
 3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index ff2b9fc7c01f..ac16e4bd496f 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -408,6 +408,16 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
 	 */
 	subflow->snd_isn = TCP_SKB_CB(skb)->end_seq;
 	if (subflow->request_mptcp) {
+		if (unlikely(subflow_simultaneous_connect(sk))) {
+			WARN_ON_ONCE(!mptcp_try_fallback(sk, MPTCP_MIB_SIMULTCONNFALLBACK));
+
+			/* Ensure mptcp_finish_connect() will not process the
+			 * MPC handshake.
+			 */
+			subflow->request_mptcp = 0;
+			return false;
+		}
+
 		opts->suboptions = OPTION_MPTCP_MPC_SYN;
 		opts->csum_reqd = mptcp_is_checksum_enabled(sock_net(sk));
 		opts->allow_join_id0 = mptcp_allow_join_id0(sock_net(sk));
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index bc470254bd6b..b995b009f31d 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -1325,19 +1325,12 @@ static inline bool mptcp_check_infinite_map(struct sk_buff *skb)
 	return false;
 }
 
-static inline bool is_active_ssk(struct mptcp_subflow_context *subflow)
-{
-	return (subflow->request_mptcp || subflow->request_join);
-}
-
 static inline bool subflow_simultaneous_connect(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 
-	return (1 << sk->sk_state) &
-	       (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING) &&
-	       is_active_ssk(subflow) &&
-	       !subflow->conn_finished;
+	/* Note that the sk state implies !subflow->conn_finished. */
+	return sk->sk_state == TCP_SYN_RECV && subflow->request_mptcp;
 }
 
 #ifdef CONFIG_SYN_COOKIES
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 86ce58ae533d..96d54cb2cd93 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1878,12 +1878,6 @@ static void subflow_state_change(struct sock *sk)
 
 	__subflow_state_change(sk);
 
-	if (subflow_simultaneous_connect(sk)) {
-		WARN_ON_ONCE(!mptcp_try_fallback(sk, MPTCP_MIB_SIMULTCONNFALLBACK));
-		subflow->conn_finished = 1;
-		mptcp_propagate_state(parent, sk, subflow, NULL);
-	}
-
 	/* as recvmsg() does not acquire the subflow socket for ssk selection
 	 * a fin packet carrying a DSS can be unnoticed if we don't trigger
 	 * the data available machinery here.
-- 
2.52.0
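
Editorial illustration, not part of the patch: the predicate change in
net/mptcp/protocol.h and the new early return in mptcp_syn_options() can
be sketched in userspace as below. This is a minimal model under stated
assumptions: the mock_* types, the fallen_back flag and the helper names
are hypothetical stand-ins, not kernel API, and only the fields read by
the two predicates are modelled. It does not model the callback path
whose absence on * -> FIN_WAIT1 transitions caused the race.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins: only the fields read by the two
 * predicates are modelled. Values follow the kernel's TCP state enum.
 */
enum mock_tcp_state {
	MOCK_ESTABLISHED = 1,
	MOCK_SYN_SENT,
	MOCK_SYN_RECV,
	MOCK_FIN_WAIT1,
	MOCK_FIN_WAIT2,
	MOCK_TIME_WAIT,
	MOCK_CLOSE,
	MOCK_CLOSE_WAIT,
	MOCK_LAST_ACK,
	MOCK_LISTEN,
	MOCK_CLOSING,
};

struct mock_subflow {
	enum mock_tcp_state sk_state;
	bool request_mptcp;
	bool request_join;
	bool conn_finished;
	bool fallen_back;	/* assumption: stands in for mptcp_try_fallback() */
};

/* Shape of the predicate removed by the patch. */
static bool old_simult_connect(const struct mock_subflow *sf)
{
	unsigned int states = (1u << MOCK_ESTABLISHED) | (1u << MOCK_FIN_WAIT1) |
			      (1u << MOCK_FIN_WAIT2) | (1u << MOCK_CLOSING);

	return ((1u << sf->sk_state) & states) &&
	       (sf->request_mptcp || sf->request_join) &&
	       !sf->conn_finished;
}

/* Shape of the predicate added by the patch. */
static bool new_simult_connect(const struct mock_subflow *sf)
{
	return sf->sk_state == MOCK_SYN_RECV && sf->request_mptcp;
}

/* Mirrors the MP_CAPABLE branch added to mptcp_syn_options(): returns true
 * when the outgoing SYN or SYN-ACK would carry the MP_CAPABLE option.
 */
static bool mock_mpc_syn_option(struct mock_subflow *sf)
{
	if (!sf->request_mptcp)
		return false;

	if (new_simult_connect(sf)) {
		sf->fallen_back = true;		/* fall back to plain TCP */
		sf->request_mptcp = false;	/* skip the MPC handshake later */
		return false;
	}
	return true;
}

/* Walk the simultaneous-connect case described in the commit message. */
static void simult_connect_selfcheck(void)
{
	struct mock_subflow sf = { .sk_state = MOCK_SYN_SENT, .request_mptcp = true };

	assert(mock_mpc_syn_option(&sf));	/* initial SYN carries MP_CAPABLE */

	sf.sk_state = MOCK_SYN_RECV;		/* crossing SYN received */
	assert(!old_simult_connect(&sf));	/* old check is not yet true here */
	assert(!mock_mpc_syn_option(&sf));	/* SYN-ACK goes out as plain TCP */
	assert(sf.fallen_back && !sf.request_mptcp);
}
```

Calling simult_connect_selfcheck() walks the same sequence as the
packetdrill excerpt above: the first SYN still advertises MP_CAPABLE,
while the SYN-ACK sent from SYN_RECV does not. In this model the new
predicate never fires for a subflow that only has request_join set.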
Re: [PATCH mptcp-net] mptcp: fallback earlier on simult connection
Posted by Matthieu Baerts 1 week, 5 days ago
Hi Paolo,

On 28/11/2025 13:22, Paolo Abeni wrote:
> Syzkaller reports a simult-connect race leading to inconsistent fallback
> status:

(...)

> The TCP subflow can process the simult-connect syn-ack packet after
> transitioning to TCP_FIN_WAIT1 state, bypassing the MPTCP fallback check,
> as the sk_state_change() callback is not invoked for * -> FIN_WAIT1
> transitions.
> 
> That will move the msk socket to an inconsistent status and the next
> incoming data will hit the reported splat.
> 
> Close the race by moving the simult-fallback check to the earliest
> possible stage - that is, at syn-ack generation time.

Good idea!

> Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().")

From what I understand, the modification on the TCP side is needed for
this fix. When reading its commit message, it sounds like it should have
contained a Fixes tag, but net-next was chosen to delay the fix. Is that
correct?

If yes, should I ask to backport this other commit with this patch? Or
should this patch be backported only up to stable trees already having
this other commit?

> Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
> Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
> Reported-by: syzbot+0ff6b771b4f7a5bce83b@syzkaller.appspotmail.com
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/586
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> @mattbe: could you please test vs your syz repro?

In progress!

> The simult connect packetdrill test will need a paired change; not
> reporting the full diff to avoid confusing the patch importer:
> 
>  +0  >  S  0:0(0)                    <mss 1460, sackOK, TS val 100 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
>  +0  <  S  0:0(0)         win 1000   <mss 1460, sackOK, TS val 407 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
> -+0  >  S. 0:0(0)  ack 1             <mss 1460, sackOK, TS val 330 ecr 407, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
> -+0  <  S. 0:0(0)  ack 1  win 65535  <mss 1460, sackOK, TS val 507 ecr 100, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey=2]>
> ++0  >  S. 0:0(0)  ack 1             <mss 1460, sackOK, TS val 330 ecr 407, nop,wscale 8>
> ++0  <  S. 0:0(0)  ack 1  win 65535  <mss 1460, sackOK, TS val 507 ecr 100, nop,wscale 8>
>  +0  >   . 1:1(0)  ack 1             <nop,      nop,    TS val 430 ecr 507, nop, nop, sack 0:1>

Thank you! Confirmed!

(...)

> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index bc470254bd6b..b995b009f31d 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -1325,19 +1325,12 @@ static inline bool mptcp_check_infinite_map(struct sk_buff *skb)
>  	return false;
>  }
>  
> -static inline bool is_active_ssk(struct mptcp_subflow_context *subflow)
> -{
> -	return (subflow->request_mptcp || subflow->request_join);

Out of curiosity, was this "subflow->request_join" not there to catch
simultaneous MPJoin connect?

Is it not needed to catch this case, and drop the second join?

https://datatracker.ietf.org/doc/html/rfc8684#section-3.9.2-8

(Or maybe the TCP code handles that as mentioned in the RFC?)

> -}
> -
>  static inline bool subflow_simultaneous_connect(struct sock *sk)
>  {
>  	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
>  
> -	return (1 << sk->sk_state) &
> -	       (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING) &&
> -	       is_active_ssk(subflow) &&
> -	       !subflow->conn_finished;
> +	/* Note that the sk state implies !subflow->conn_finished. */
> +	return sk->sk_state == TCP_SYN_RECV && subflow->request_mptcp;

Detail: do we still require this helper? It is only used in one place,
already under "subflow->request_mptcp". But we can keep it if it is
easier / clearer.

>  }
>  
>  #ifdef CONFIG_SYN_COOKIES
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 86ce58ae533d..96d54cb2cd93 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -1878,12 +1878,6 @@ static void subflow_state_change(struct sock *sk)
>  
>  	__subflow_state_change(sk);
>  
> -	if (subflow_simultaneous_connect(sk)) {
> -		WARN_ON_ONCE(!mptcp_try_fallback(sk, MPTCP_MIB_SIMULTCONNFALLBACK));
> -		subflow->conn_finished = 1;
> -		mptcp_propagate_state(parent, sk, subflow, NULL);
> -	}
> -
>  	/* as recvmsg() does not acquire the subflow socket for ssk selection
>  	 * a fin packet carrying a DSS can be unnoticed if we don't trigger
>  	 * the data available machinery here.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-net] mptcp: fallback earlier on simult connection
Posted by Matthieu Baerts 1 week, 4 days ago
Hi Paolo,

On 01/12/2025 18:35, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 28/11/2025 13:22, Paolo Abeni wrote:
>> Syzkaller reports a simult-connect race leading to inconsistent fallback
>> status:
> 
> (...)
> 
>> The TCP subflow can process the simult-connect syn-ack packet after
>> transitioning to TCP_FIN_WAIT1 state, bypassing the MPTCP fallback check,
>> as the sk_state_change() callback is not invoked for * -> FIN_WAIT1
>> transitions.
>>
>> That will move the msk socket to an inconsistent status and the next
>> incoming data will hit the reported splat.
>>
>> Close the race by moving the simult-fallback check to the earliest
>> possible stage - that is, at syn-ack generation time.
> 
> Good idea!
> 
>> Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().")
> 
> From what I understand, the modification on the TCP side is needed for
> this fix. When reading its commit message, it sounds like it should have
> contained a Fixes tag, but net-next was chosen to delay the fix. Is that
> correct?
> 
> If yes, should I ask to backport this other commit with this patch? Or
> should this patch be backported only up to stable trees already having
> this other commit?
> 
>> Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
>> Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
>> Reported-by: syzbot+0ff6b771b4f7a5bce83b@syzkaller.appspotmail.com
>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/586
>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>> ---
>> @mattbe: could you please test vs your syz repro?
> 
> In progress!

Sorry for the delay, I had some issues with my setup.

I can still reproduce with your patch and my reproducer. I have added
more details on:

  https://github.com/multipath-tcp/mptcp_net-next/issues/586

I can easily reproduce it now, and the issue might be slightly different
from the one identified by syzbot.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-net] mptcp: fallback earlier on simult connection
Posted by MPTCP CI 2 weeks, 1 day ago
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Unstable: 2 failed test(s): packetdrill_syscalls selftest_simult_flows 🔴
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_syscalls 🔴
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Unstable: 1 failed test(s): bpftest_test_progs_mptcp 🔴
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19764025304

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/23fd9c22f527
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1028652


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Re: [PATCH mptcp-net] mptcp: fallback earlier on simult connection
Posted by Paolo Abeni 1 week, 5 days ago
On 11/28/25 2:35 PM, MPTCP CI wrote:
> Thank you for your modifications, that's great!
> 
> Our CI did some validations and here is its report:
> 
> - KVM Validation: normal (except selftest_mptcp_join): Unstable: 2 failed test(s): packetdrill_syscalls selftest_simult_flows 🔴
> - KVM Validation: normal (only selftest_mptcp_join): Success! ✅
> - KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_syscalls 🔴
> - KVM Validation: debug (only selftest_mptcp_join): Success! ✅
> - KVM Validation: btf-normal (only bpftest_all): Success! ✅
> - KVM Validation: btf-debug (only bpftest_all): Unstable: 1 failed test(s): bpftest_test_progs_mptcp 🔴
> - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19764025304

As mentioned, the packetdrill failures are expected - the test needs to
be updated.

The simult flows and btf failures are unexpected, but look unrelated
to this patch.

/P

Re: [PATCH mptcp-net] mptcp: fallback earlier on simult connection
Posted by Matthieu Baerts 1 week, 5 days ago
Hi Paolo,

On 01/12/2025 12:10, Paolo Abeni wrote:
> On 11/28/25 2:35 PM, MPTCP CI wrote:
>> Thank you for your modifications, that's great!
>>
>> Our CI did some validations and here is its report:
>>
>> - KVM Validation: normal (except selftest_mptcp_join): Unstable: 2 failed test(s): packetdrill_syscalls selftest_simult_flows 🔴
>> - KVM Validation: normal (only selftest_mptcp_join): Success! ✅
>> - KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_syscalls 🔴
>> - KVM Validation: debug (only selftest_mptcp_join): Success! ✅
>> - KVM Validation: btf-normal (only bpftest_all): Success! ✅
>> - KVM Validation: btf-debug (only bpftest_all): Unstable: 1 failed test(s): bpftest_test_progs_mptcp 🔴
>> - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19764025304
> 
> As mentioned, the packetdrill failures are expected - the test needs to
> be updated.
> 
> The simult flows and btf failures are unexpected, but look unrelated
> to this patch.

Thank you for having checked! (Sorry, I'm lagging a bit behind)

Regarding the simult_flows tests, the last job checking the sync with
net/net-next also has a failure:

https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19812973527

So not a regression due to your patch, but I hope there is not one due
to other changes!

(The last job also mentioned issues with a new packetdrill test, I'm
going to look at it, then apply your patch!)

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.