MPTCP uses a TCP SYN packet with the MP_JOIN option to add subflows to
an existing MPTCP connection. This uses token lookup to match the
incoming SYN with the appropriate MPTCP socket instead of the port
number (in accordance with RFC 8684), and it is possible for a TCP
listener to be present at the port that the MPTCP peer is trying to
connect to. If the TCP listener ignores the MP_JOIN option and sends a
normal TCP SYN-ACK without a MP_JOIN option header, the peer will be
forced to send a RST.
Given that a peer sending MP_JOIN is guaranteed to reject a regular TCP
SYN-ACK, it's better to attempt a MPTCP token lookup and either let
MPTCP continue the handshake or immediately reject the connection
attempt.
The only new conditional in the TCP SYN handling hot path is a check of
the saw_mp_join bit. If that is set, tcp_check_req() returns early and
tcp_v4_rcv() or tcp_v6_rcv() call the MPTCP join handler in an error
path.
If MPTCP is not configured, all the code added in this commit should be
optimized away.
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
net/ipv4/tcp_ipv4.c | 7 +++++++
net/ipv4/tcp_minisocks.c | 8 +++++++-
net/ipv6/tcp_ipv6.c | 7 +++++++
3 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 73e5726af8d0..124ad393e6fb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2082,6 +2082,13 @@ int tcp_v4_rcv(struct sk_buff *skb)
tcp_v4_restore_cb(skb);
sock_put(sk);
goto lookup;
+ } else if (req_status == TCP_REQ_MP_JOIN) {
+ struct sock *msk = mptcp_handle_join4(skb);
+
+ if (msk) {
+ sk = msk;
+ goto process;
+ }
}
goto discard_and_relse;
}
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 3d989f8676a5..4fdca31b33bb 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -572,10 +572,16 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
bool own_req;
tmp_opt.saw_tstamp = 0;
+ tmp_opt.saw_mp_join = 0;
if (th->doff > (sizeof(struct tcphdr)>>2)) {
tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL);
- if (tmp_opt.saw_tstamp) {
+ if (unlikely(tmp_opt.saw_mp_join) &&
+ mptcp_is_enabled(sock_net(sk))) {
+ *req_status = TCP_REQ_MP_JOIN;
+
+ return NULL;
+ } else if (tmp_opt.saw_tstamp) {
tmp_opt.ts_recent = req->ts_recent;
if (tmp_opt.rcv_tsecr)
tmp_opt.rcv_tsecr -= tcp_rsk(req)->ts_off;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 4706aad6dd8f..d46fa8243e83 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1731,6 +1731,13 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
tcp_v6_restore_cb(skb);
sock_put(sk);
goto lookup;
+ } else if (req_status == TCP_REQ_MP_JOIN) {
+ struct sock *msk = mptcp_handle_join6(skb);
+
+ if (msk) {
+ sk = msk;
+ goto process;
+ }
}
goto discard_and_relse;
}
--
2.35.1
On Thu, 2022-02-24 at 15:22 -0800, Mat Martineau wrote:
> MPTCP uses a TCP SYN packet with the MP_JOIN option to add subflows to
> an existing MPTCP connection. This uses token lookup to match the
> incoming SYN with the appropriate MPTCP socket instead of the port
> number (in accordance with RFC 8684), and it is possible for a TCP
> listener to be present at the port that the MPTCP peer is trying to
> connect to. If the TCP listener ignores the MP_JOIN option and sends a
> normal TCP SYN-ACK without a MP_JOIN option header, the peer will be
> forced to send a RST.
>
> Given that a peer sending MP_JOIN is guaranteed to reject a regular TCP
> SYN-ACK, it's better to attempt a MPTCP token lookup and either let
> MPTCP continue the handshake or immediately reject the connection
> attempt.
>
> The only new conditional in the TCP SYN handling hot path is a check of
> the saw_mp_join bit. If that is set, tcp_check_req() returns early and
> tcp_v4_rcv() or tcp_v6_rcv() call the MPTCP join handler in an error
> path.
>
> If MPTCP is not configured, all the code added in this commit should be
> optimized away.
>
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
> net/ipv4/tcp_ipv4.c | 7 +++++++
> net/ipv4/tcp_minisocks.c | 8 +++++++-
> net/ipv6/tcp_ipv6.c | 7 +++++++
> 3 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 73e5726af8d0..124ad393e6fb 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2082,6 +2082,13 @@ int tcp_v4_rcv(struct sk_buff *skb)
> tcp_v4_restore_cb(skb);
> sock_put(sk);
> goto lookup;
> + } else if (req_status == TCP_REQ_MP_JOIN) {
> + struct sock *msk = mptcp_handle_join4(skb);
> +
> + if (msk) {
> + sk = msk;
> + goto process;
> + }
If I read correctly, this will handle MPJ 3rd ack packets, we will need
a similar chunk for incoming MPJ syn in tcp_rcv_state_process(), and
likely something additional for syncookies.
> }
> goto discard_and_relse;
> }
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 3d989f8676a5..4fdca31b33bb 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -572,10 +572,16 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
> bool own_req;
>
> tmp_opt.saw_tstamp = 0;
> + tmp_opt.saw_mp_join = 0;
> if (th->doff > (sizeof(struct tcphdr)>>2)) {
> tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL);
>
> - if (tmp_opt.saw_tstamp) {
> + if (unlikely(tmp_opt.saw_mp_join) &&
> + mptcp_is_enabled(sock_net(sk))) {
> + *req_status = TCP_REQ_MP_JOIN;
If I read correctly, this skips all the state and sequence number check
we have on plain TCP 3rd ack. I think we want to preserve them.
Cheers,
Paolo
On Fri, 25 Feb 2022, Paolo Abeni wrote:
> On Thu, 2022-02-24 at 15:22 -0800, Mat Martineau wrote:
>> MPTCP uses a TCP SYN packet with the MP_JOIN option to add subflows to
>> an existing MPTCP connection. This uses token lookup to match the
>> incoming SYN with the appropriate MPTCP socket instead of the port
>> number (in accordance with RFC 8684), and it is possible for a TCP
>> listener to be present at the port that the MPTCP peer is trying to
>> connect to. If the TCP listener ignores the MP_JOIN option and sends a
>> normal TCP SYN-ACK without a MP_JOIN option header, the peer will be
>> forced to send a RST.
>>
>> Given that a peer sending MP_JOIN is guaranteed to reject a regular TCP
>> SYN-ACK, it's better to attempt a MPTCP token lookup and either let
>> MPTCP continue the handshake or immediately reject the connection
>> attempt.
>>
>> The only new conditional in the TCP SYN handling hot path is a check of
>> the saw_mp_join bit. If that is set, tcp_check_req() returns early and
>> tcp_v4_rcv() or tcp_v6_rcv() call the MPTCP join handler in an error
>> path.
>>
>> If MPTCP is not configured, all the code added in this commit should be
>> optimized away.
>>
>> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
>> ---
>> net/ipv4/tcp_ipv4.c | 7 +++++++
>> net/ipv4/tcp_minisocks.c | 8 +++++++-
>> net/ipv6/tcp_ipv6.c | 7 +++++++
>> 3 files changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
>> index 73e5726af8d0..124ad393e6fb 100644
>> --- a/net/ipv4/tcp_ipv4.c
>> +++ b/net/ipv4/tcp_ipv4.c
>> @@ -2082,6 +2082,13 @@ int tcp_v4_rcv(struct sk_buff *skb)
>> tcp_v4_restore_cb(skb);
>> sock_put(sk);
>> goto lookup;
>> + } else if (req_status == TCP_REQ_MP_JOIN) {
>> + struct sock *msk = mptcp_handle_join4(skb);
>> +
>> + if (msk) {
>> + sk = msk;
>> + goto process;
>> + }
>
> If I read correctly, this will handle MPJ 3rd ack packets, we will need
> a similar chunk for incoming MPJ syn in tcp_rcv_state_process(), and
> likely something additional for syncookies.
>
I think you're correct here and below. In trying to put a quick RFC
together, I was thinking about what happens on this code path when trying
to get the TCP listener to reject, but skipped over the use of this code
by the MPTCP subflow listener. Handling both cases does make this approach
more intrusive to the TCP code.
>> }
>> goto discard_and_relse;
>> }
>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
>> index 3d989f8676a5..4fdca31b33bb 100644
>> --- a/net/ipv4/tcp_minisocks.c
>> +++ b/net/ipv4/tcp_minisocks.c
>> @@ -572,10 +572,16 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
>> bool own_req;
>>
>> tmp_opt.saw_tstamp = 0;
>> + tmp_opt.saw_mp_join = 0;
>> if (th->doff > (sizeof(struct tcphdr)>>2)) {
>> tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL);
>>
>> - if (tmp_opt.saw_tstamp) {
>> + if (unlikely(tmp_opt.saw_mp_join) &&
>> + mptcp_is_enabled(sock_net(sk))) {
>> + *req_status = TCP_REQ_MP_JOIN;
>
> If I read correctly, this skips all the state and sequence number check
> we have on plain TCP 3rd ack. I think we want to preserve them.
--
Mat Martineau
Intel
On Wed, 2 Mar 2022, Mat Martineau wrote:
> On Fri, 25 Feb 2022, Paolo Abeni wrote:
>
>> On Thu, 2022-02-24 at 15:22 -0800, Mat Martineau wrote:
>>> MPTCP uses a TCP SYN packet with the MP_JOIN option to add subflows to
>>> an existing MPTCP connection. This uses token lookup to match the
>>> incoming SYN with the appropriate MPTCP socket instead of the port
>>> number (in accordance with RFC 8684), and it is possible for a TCP
>>> listener to be present at the port that the MPTCP peer is trying to
>>> connect to. If the TCP listener ignores the MP_JOIN option and sends a
>>> normal TCP SYN-ACK without a MP_JOIN option header, the peer will be
>>> forced to send a RST.
>>>
>>> Given that a peer sending MP_JOIN is guaranteed to reject a regular TCP
>>> SYN-ACK, it's better to attempt a MPTCP token lookup and either let
>>> MPTCP continue the handshake or immediately reject the connection
>>> attempt.
>>>
>>> The only new conditional in the TCP SYN handling hot path is a check of
>>> the saw_mp_join bit. If that is set, tcp_check_req() returns early and
>>> tcp_v4_rcv() or tcp_v6_rcv() call the MPTCP join handler in an error
>>> path.
>>>
>>> If MPTCP is not configured, all the code added in this commit should be
>>> optimized away.
>>>
>>> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
>>> ---
>>> net/ipv4/tcp_ipv4.c | 7 +++++++
>>> net/ipv4/tcp_minisocks.c | 8 +++++++-
>>> net/ipv6/tcp_ipv6.c | 7 +++++++
>>> 3 files changed, 21 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
>>> index 73e5726af8d0..124ad393e6fb 100644
>>> --- a/net/ipv4/tcp_ipv4.c
>>> +++ b/net/ipv4/tcp_ipv4.c
>>> @@ -2082,6 +2082,13 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>> tcp_v4_restore_cb(skb);
>>> sock_put(sk);
>>> goto lookup;
>>> + } else if (req_status == TCP_REQ_MP_JOIN) {
>>> + struct sock *msk = mptcp_handle_join4(skb);
>>> +
>>> + if (msk) {
>>> + sk = msk;
>>> + goto process;
>>> + }
>>
>> If I read correctly, this will handle MPJ 3rd ack packets, we will need
>> a similar chunk for incoming MPJ syn in tcp_rcv_state_process(), and
>> likely something additional for syncookies.
>>
>
> I think you're correct here and below. In trying to put a quick RFC
> together, I was thinking about what happens on this code path when
> trying to get the TCP listener to reject, but skipped over the use of
> this code by the MPTCP subflow listener. Handling both cases does make
> this approach more intrusive to the TCP code.
>
...on second thought, maybe not much more intrusive?
>>> }
>>> goto discard_and_relse;
>>> }
>>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
>>> index 3d989f8676a5..4fdca31b33bb 100644
>>> --- a/net/ipv4/tcp_minisocks.c
>>> +++ b/net/ipv4/tcp_minisocks.c
>>> @@ -572,10 +572,16 @@ struct sock *tcp_check_req(struct sock *sk, struct
>>> sk_buff *skb,
>>> bool own_req;
>>>
>>> tmp_opt.saw_tstamp = 0;
>>> + tmp_opt.saw_mp_join = 0;
>>> if (th->doff > (sizeof(struct tcphdr)>>2)) {
>>> tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL);
>>>
>>> - if (tmp_opt.saw_tstamp) {
>>> + if (unlikely(tmp_opt.saw_mp_join) &&
Make sure this only triggers on a regular TCP listener:
+ !sk_is_mptcp(sk) &&
>>> + mptcp_is_enabled(sock_net(sk))) {
>>> + *req_status = TCP_REQ_MP_JOIN;
Then the TCP_REQ_MP_JOIN status is only ever encountered on a plain TCP
listener where the MP_JOIN header is present. This was my original intent
but the above conditional was buggy.
>>
>> If I read correctly, this skips all the state and sequence number check
>> we have on plain TCP 3rd ack. I think we want to preserve them.
>
For plain TCP 3rd ack, there's no MP_JOIN option and tcp_check_req()
behavior is unchanged.
For a MP_JOIN SYN being processed by a plain TCP listener, this returns
early, then the token lookup is attempted as if there had been no TCP
listener to begin with. If there's a token match then the MPTCP listener
will do all the checks like it did before.
--
Mat Martineau
Intel
Hi Mat, Thank you for your modifications, that's great! But sadly, our CI spotted some issues with it when trying to build it. You can find more details there: https://patchwork.kernel.org/project/mptcp/patch/20220224232226.259304-5-mathew.j.martineau@linux.intel.com/ https://github.com/multipath-tcp/mptcp_net-next/actions/runs/1895883332 Status: failure Initiator: MPTCPimporter Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/0f58cdeb67d0 Feel free to reply to this email if you cannot access logs, if you need some support to fix the error, if this doesn't seem to be caused by your modifications or if the error is a false positive one. Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (Tessares)
Hi Mat, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: Script error! ❓: - : - Task: https://cirrus-ci.com/task/6161624997298176 - Summary: https://api.cirrus-ci.com/v1/artifact/task/6161624997298176/summary/summary.txt - KVM Validation: Script error! ❓: - : - Task: https://cirrus-ci.com/task/5598675043876864 - Summary: https://api.cirrus-ci.com/v1/artifact/task/5598675043876864/summary/summary.txt Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/0f58cdeb67d0 Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (Tessares)
© 2016 - 2026 Red Hat, Inc.