[PATCH mptcp-net] mptcp: fix connect timeout handling

Paolo Abeni posted 1 patch 11 months, 3 weeks ago
Failed in applying to current master (apply log)
Maintainers: Matthieu Baerts <matthieu.baerts@tessares.net>, Mat Martineau <martineau@kernel.org>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
net/mptcp/protocol.c | 29 +++++++----------------------
net/mptcp/protocol.h |  1 -
2 files changed, 7 insertions(+), 23 deletions(-)
[PATCH mptcp-net] mptcp: fix connect timeout handling
Posted by Paolo Abeni 11 months, 3 weeks ago
Ondrej reported a functional issue WRT timeout handling on connect
with a nice reproducer.

The problem is that the current mptcp connect waits for both the
MPTCP socket level timeout, and the first subflow socket timeout.
The latter is not influenced/touched by the exposed setsockopt().

Overall the above makes the SO_SNDTIMEO a no-op on connect.

Since mptcp_connect is invoked via inet_stream_connect and the
latter properly handle the MPTCP level timeout, we can address the
issue making the nested subflow level connect always unblocking.

This also allow simplifying a bit the code, dropping an ugly hack
to handle the fastopen and custom proto_ops connect.

The issues predates the blamed commit below, but the current resolution
requires the infrastructure introduced there.

Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
Note: there is needed follow-up correctly propagating the SYN_SENT
-> CLOSE state transition from the first subflow to the msk socket in
case of connect error, but that is somewhat independed/pre-existent.

Sending out early for testing.
---
 net/mptcp/protocol.c | 29 +++++++----------------------
 net/mptcp/protocol.h |  1 -
 2 files changed, 7 insertions(+), 23 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f5842f93c891..4747058a8b6b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1734,7 +1734,6 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg,
 
 	lock_sock(ssk);
 	msg->msg_flags |= MSG_DONTWAIT;
-	msk->connect_flags = O_NONBLOCK;
 	msk->fastopening = 1;
 	ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, len, NULL);
 	msk->fastopening = 0;
@@ -3672,9 +3671,9 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	 * acquired the subflow socket lock, too.
 	 */
 	if (msk->fastopening)
-		err = __inet_stream_connect(ssock, uaddr, addr_len, msk->connect_flags, 1);
+		err = __inet_stream_connect(ssock, uaddr, addr_len, O_NONBLOCK, 1);
 	else
-		err = inet_stream_connect(ssock, uaddr, addr_len, msk->connect_flags);
+		err = inet_stream_connect(ssock, uaddr, addr_len, O_NONBLOCK);
 	inet_sk(sk)->defer_connect = inet_sk(ssock->sk)->defer_connect;
 
 	/* on successful connect, the msk state will be moved to established by
@@ -3687,12 +3686,10 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 
 	mptcp_copy_inaddrs(sk, ssock->sk);
 
-	/* unblocking connect, mptcp-level inet_stream_connect will error out
-	 * without changing the socket state, update it here.
+	/* silence EINPROGRESS and let the caller inet_stream_connect
+	 * handle the connection in progress
 	 */
-	if (err == -EINPROGRESS)
-		sk->sk_socket->state = ssock->state;
-	return err;
+	return 0;
 }
 
 static struct proto mptcp_prot = {
@@ -3751,18 +3748,6 @@ static int mptcp_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	return err;
 }
 
-static int mptcp_stream_connect(struct socket *sock, struct sockaddr *uaddr,
-				int addr_len, int flags)
-{
-	int ret;
-
-	lock_sock(sock->sk);
-	mptcp_sk(sock->sk)->connect_flags = flags;
-	ret = __inet_stream_connect(sock, uaddr, addr_len, flags, 0);
-	release_sock(sock->sk);
-	return ret;
-}
-
 static int mptcp_listen(struct socket *sock, int backlog)
 {
 	struct mptcp_sock *msk = mptcp_sk(sock->sk);
@@ -3917,7 +3902,7 @@ static const struct proto_ops mptcp_stream_ops = {
 	.owner		   = THIS_MODULE,
 	.release	   = inet_release,
 	.bind		   = mptcp_bind,
-	.connect	   = mptcp_stream_connect,
+	.connect	   = inet_stream_connect,
 	.socketpair	   = sock_no_socketpair,
 	.accept		   = mptcp_stream_accept,
 	.getname	   = inet_getname,
@@ -4012,7 +3997,7 @@ static const struct proto_ops mptcp_v6_stream_ops = {
 	.owner		   = THIS_MODULE,
 	.release	   = inet6_release,
 	.bind		   = mptcp_bind,
-	.connect	   = mptcp_stream_connect,
+	.connect	   = inet_stream_connect,
 	.socketpair	   = sock_no_socketpair,
 	.accept		   = mptcp_stream_accept,
 	.getname	   = inet6_getname,
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 8d9d87e2f840..39613bba1c46 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -300,7 +300,6 @@ struct mptcp_sock {
 			nodelay:1,
 			fastopening:1,
 			in_accept_queue:1;
-	int		connect_flags;
 	struct work_struct work;
 	struct sk_buff  *ooo_last_skb;
 	struct rb_root  out_of_order_queue;
-- 
2.40.0
Re: [PATCH mptcp-net] mptcp: fix connect timeout handling
Posted by Matthieu Baerts 11 months, 2 weeks ago
Hi Paolo, Mat,

On 05/05/2023 21:34, Paolo Abeni wrote:
> Ondrej reported a functional issue WRT timeout handling on connect
> with a nice reproducer.
> 
> The problem is that the current mptcp connect waits for both the
> MPTCP socket level timeout, and the first subflow socket timeout.
> The latter is not influenced/touched by the exposed setsockopt().
> 
> Overall the above makes the SO_SNDTIMEO a no-op on connect.
> 
> Since mptcp_connect is invoked via inet_stream_connect and the
> latter properly handle the MPTCP level timeout, we can address the
> issue making the nested subflow level connect always unblocking.
> 
> This also allow simplifying a bit the code, dropping an ugly hack
> to handle the fastopen and custom proto_ops connect.
> 
> The issues predates the blamed commit below, but the current resolution
> requires the infrastructure introduced there.
> 
> Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Thank you for the fix and the review!

I just applied it in our repo with Mat's RvB tag. I also added Ondrej's
Reported-by tag.

New patches for t/upstream-net and t/upstream:
- 59d33b72ee59: mptcp: fix connect timeout handling
- 8d8fd087a52c: tg:msg: add Ondrej's Reported-by tag
- Results: 8ecd3b5ba105..fa4a5780caa4 (export-net)
- Results: d9d25c574c4e..b8e056f78785 (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20230515T155446
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20230515T155446

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
Re: [PATCH mptcp-net] mptcp: fix connect timeout handling
Posted by Mat Martineau 11 months, 2 weeks ago
On Fri, 5 May 2023, Paolo Abeni wrote:

> Ondrej reported a functional issue WRT timeout handling on connect
> with a nice reproducer.
>
> The problem is that the current mptcp connect waits for both the
> MPTCP socket level timeout, and the first subflow socket timeout.
> The latter is not influenced/touched by the exposed setsockopt().
>
> Overall the above makes the SO_SNDTIMEO a no-op on connect.
>
> Since mptcp_connect is invoked via inet_stream_connect and the
> latter properly handle the MPTCP level timeout, we can address the
> issue making the nested subflow level connect always unblocking.
>
> This also allow simplifying a bit the code, dropping an ugly hack
> to handle the fastopen and custom proto_ops connect.
>
> The issues predates the blamed commit below, but the current resolution
> requires the infrastructure introduced there.
>
> Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> Note: there is needed follow-up correctly propagating the SYN_SENT
> -> CLOSE state transition from the first subflow to the msk socket in
> case of connect error, but that is somewhat independed/pre-existent.
>
> Sending out early for testing.

Hi Paolo -

Patch LGTM, and appreciate the cleanup of msk->connect_flags:

Reviewed-by: Mat Martineau <martineau@kernel.org>

I think it would be good to also add a selftest in net-next to check the 
connect timeout (set to a very short time), but that doesn't need to block 
this -net change.

Thanks,
Mat

> ---
> net/mptcp/protocol.c | 29 +++++++----------------------
> net/mptcp/protocol.h |  1 -
> 2 files changed, 7 insertions(+), 23 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index f5842f93c891..4747058a8b6b 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1734,7 +1734,6 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg,
>
> 	lock_sock(ssk);
> 	msg->msg_flags |= MSG_DONTWAIT;
> -	msk->connect_flags = O_NONBLOCK;
> 	msk->fastopening = 1;
> 	ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, len, NULL);
> 	msk->fastopening = 0;
> @@ -3672,9 +3671,9 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> 	 * acquired the subflow socket lock, too.
> 	 */
> 	if (msk->fastopening)
> -		err = __inet_stream_connect(ssock, uaddr, addr_len, msk->connect_flags, 1);
> +		err = __inet_stream_connect(ssock, uaddr, addr_len, O_NONBLOCK, 1);
> 	else
> -		err = inet_stream_connect(ssock, uaddr, addr_len, msk->connect_flags);
> +		err = inet_stream_connect(ssock, uaddr, addr_len, O_NONBLOCK);
> 	inet_sk(sk)->defer_connect = inet_sk(ssock->sk)->defer_connect;
>
> 	/* on successful connect, the msk state will be moved to established by
> @@ -3687,12 +3686,10 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
>
> 	mptcp_copy_inaddrs(sk, ssock->sk);
>
> -	/* unblocking connect, mptcp-level inet_stream_connect will error out
> -	 * without changing the socket state, update it here.
> +	/* silence EINPROGRESS and let the caller inet_stream_connect
> +	 * handle the connection in progress
> 	 */
> -	if (err == -EINPROGRESS)
> -		sk->sk_socket->state = ssock->state;
> -	return err;
> +	return 0;
> }
>
> static struct proto mptcp_prot = {
> @@ -3751,18 +3748,6 @@ static int mptcp_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
> 	return err;
> }
>
> -static int mptcp_stream_connect(struct socket *sock, struct sockaddr *uaddr,
> -				int addr_len, int flags)
> -{
> -	int ret;
> -
> -	lock_sock(sock->sk);
> -	mptcp_sk(sock->sk)->connect_flags = flags;
> -	ret = __inet_stream_connect(sock, uaddr, addr_len, flags, 0);
> -	release_sock(sock->sk);
> -	return ret;
> -}
> -
> static int mptcp_listen(struct socket *sock, int backlog)
> {
> 	struct mptcp_sock *msk = mptcp_sk(sock->sk);
> @@ -3917,7 +3902,7 @@ static const struct proto_ops mptcp_stream_ops = {
> 	.owner		   = THIS_MODULE,
> 	.release	   = inet_release,
> 	.bind		   = mptcp_bind,
> -	.connect	   = mptcp_stream_connect,
> +	.connect	   = inet_stream_connect,
> 	.socketpair	   = sock_no_socketpair,
> 	.accept		   = mptcp_stream_accept,
> 	.getname	   = inet_getname,
> @@ -4012,7 +3997,7 @@ static const struct proto_ops mptcp_v6_stream_ops = {
> 	.owner		   = THIS_MODULE,
> 	.release	   = inet6_release,
> 	.bind		   = mptcp_bind,
> -	.connect	   = mptcp_stream_connect,
> +	.connect	   = inet_stream_connect,
> 	.socketpair	   = sock_no_socketpair,
> 	.accept		   = mptcp_stream_accept,
> 	.getname	   = inet6_getname,
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 8d9d87e2f840..39613bba1c46 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -300,7 +300,6 @@ struct mptcp_sock {
> 			nodelay:1,
> 			fastopening:1,
> 			in_accept_queue:1;
> -	int		connect_flags;
> 	struct work_struct work;
> 	struct sk_buff  *ooo_last_skb;
> 	struct rb_root  out_of_order_queue;
> -- 
> 2.40.0
>
>
>
Re: [PATCH mptcp-net] mptcp: fix connect timeout handling
Posted by Paolo Abeni 11 months, 2 weeks ago
On Fri, 2023-05-12 at 12:24 -0700, Mat Martineau wrote:
> On Fri, 5 May 2023, Paolo Abeni wrote:
> 
> > Ondrej reported a functional issue WRT timeout handling on connect
> > with a nice reproducer.
> > 
> > The problem is that the current mptcp connect waits for both the
> > MPTCP socket level timeout, and the first subflow socket timeout.
> > The latter is not influenced/touched by the exposed setsockopt().
> > 
> > Overall the above makes the SO_SNDTIMEO a no-op on connect.
> > 
> > Since mptcp_connect is invoked via inet_stream_connect and the
> > latter properly handle the MPTCP level timeout, we can address the
> > issue making the nested subflow level connect always unblocking.
> > 
> > This also allow simplifying a bit the code, dropping an ugly hack
> > to handle the fastopen and custom proto_ops connect.
> > 
> > The issues predates the blamed commit below, but the current resolution
> > requires the infrastructure introduced there.
> > 
> > Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> > Note: there is needed follow-up correctly propagating the SYN_SENT
> > -> CLOSE state transition from the first subflow to the msk socket in
> > case of connect error, but that is somewhat independed/pre-existent.
> > 
> > Sending out early for testing.
> 
> Hi Paolo -
> 
> Patch LGTM, and appreciate the cleanup of msk->connect_flags:
> 
> Reviewed-by: Mat Martineau <martineau@kernel.org>
> 
> I think it would be good to also add a selftest in net-next to check the 
> connect timeout (set to a very short time), but that doesn't need to block 
> this -net change.

Agreed. Perhaps a pktdrill script would could be simpler option?!?

Thanks!

Paolo
Re: [PATCH mptcp-net] mptcp: fix connect timeout handling
Posted by Matthieu Baerts 11 months, 2 weeks ago
Hi Paolo, Mat,

On 15/05/2023 10:16, Paolo Abeni wrote:
> On Fri, 2023-05-12 at 12:24 -0700, Mat Martineau wrote:
>> On Fri, 5 May 2023, Paolo Abeni wrote:
>>
>>> Ondrej reported a functional issue WRT timeout handling on connect
>>> with a nice reproducer.
>>>
>>> The problem is that the current mptcp connect waits for both the
>>> MPTCP socket level timeout, and the first subflow socket timeout.
>>> The latter is not influenced/touched by the exposed setsockopt().
>>>
>>> Overall the above makes the SO_SNDTIMEO a no-op on connect.
>>>
>>> Since mptcp_connect is invoked via inet_stream_connect and the
>>> latter properly handle the MPTCP level timeout, we can address the
>>> issue making the nested subflow level connect always unblocking.
>>>
>>> This also allow simplifying a bit the code, dropping an ugly hack
>>> to handle the fastopen and custom proto_ops connect.
>>>
>>> The issues predates the blamed commit below, but the current resolution
>>> requires the infrastructure introduced there.
>>>
>>> Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>> ---
>>> Note: there is needed follow-up correctly propagating the SYN_SENT
>>> -> CLOSE state transition from the first subflow to the msk socket in
>>> case of connect error, but that is somewhat independed/pre-existent.
>>>
>>> Sending out early for testing.
>>
>> Hi Paolo -
>>
>> Patch LGTM, and appreciate the cleanup of msk->connect_flags:
>>
>> Reviewed-by: Mat Martineau <martineau@kernel.org>
>>
>> I think it would be good to also add a selftest in net-next to check the 
>> connect timeout (set to a very short time), but that doesn't need to block 
>> this -net change.
> 
> Agreed. Perhaps a pktdrill script would could be simpler option?!?

Good idea, that seems more appropriated, especially with the timeout part.

Just to avoid deadlocks: is there someone working on that? :)

But as Mat said, this is not blocking, I can apply the patch.

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
Re: [PATCH mptcp-net] mptcp: fix connect timeout handling
Posted by Paolo Abeni 11 months, 2 weeks ago
On Mon, 2023-05-15 at 17:46 +0200, Matthieu Baerts wrote:
> Hi Paolo, Mat,
> 
> On 15/05/2023 10:16, Paolo Abeni wrote:
> > On Fri, 2023-05-12 at 12:24 -0700, Mat Martineau wrote:
> > > On Fri, 5 May 2023, Paolo Abeni wrote:
> > > 
> > > > Ondrej reported a functional issue WRT timeout handling on connect
> > > > with a nice reproducer.
> > > > 
> > > > The problem is that the current mptcp connect waits for both the
> > > > MPTCP socket level timeout, and the first subflow socket timeout.
> > > > The latter is not influenced/touched by the exposed setsockopt().
> > > > 
> > > > Overall the above makes the SO_SNDTIMEO a no-op on connect.
> > > > 
> > > > Since mptcp_connect is invoked via inet_stream_connect and the
> > > > latter properly handle the MPTCP level timeout, we can address the
> > > > issue making the nested subflow level connect always unblocking.
> > > > 
> > > > This also allow simplifying a bit the code, dropping an ugly hack
> > > > to handle the fastopen and custom proto_ops connect.
> > > > 
> > > > The issues predates the blamed commit below, but the current resolution
> > > > requires the infrastructure introduced there.
> > > > 
> > > > Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
> > > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
> > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > > ---
> > > > Note: there is needed follow-up correctly propagating the SYN_SENT
> > > > -> CLOSE state transition from the first subflow to the msk socket in
> > > > case of connect error, but that is somewhat independed/pre-existent.
> > > > 
> > > > Sending out early for testing.
> > > 
> > > Hi Paolo -
> > > 
> > > Patch LGTM, and appreciate the cleanup of msk->connect_flags:
> > > 
> > > Reviewed-by: Mat Martineau <martineau@kernel.org>
> > > 
> > > I think it would be good to also add a selftest in net-next to check the 
> > > connect timeout (set to a very short time), but that doesn't need to block 
> > > this -net change.
> > 
> > Agreed. Perhaps a pktdrill script would could be simpler option?!?
> 
> Good idea, that seems more appropriated, especially with the timeout part.
> 
> Just to avoid deadlocks: is there someone working on that? :)

Yep, just opended a PR:

https://github.com/multipath-tcp/packetdrill/pull/115

/P
Re: [PATCH mptcp-net] mptcp: fix connect timeout handling
Posted by Paolo Abeni 11 months, 3 weeks ago
On Fri, 2023-05-05 at 21:34 +0200, Paolo Abeni wrote:
> Note: there is needed follow-up correctly propagating the SYN_SENT
> -> CLOSE state transition from the first subflow to the msk socket in
> case of connect error, but that is somewhat independed/pre-existent.

Re-reading the relevant code, there is no need for such follow-up:
commit 1249db44a102 ("mptcp: be careful on subflow status propagation
on errors") already take care of it.

Cheers,

Paolo