net/mptcp/protocol.c | 29 +++++++---------------------- net/mptcp/protocol.h | 1 - 2 files changed, 7 insertions(+), 23 deletions(-)
Ondrej reported a functional issue WRT timeout handling on connect
with a nice reproducer.
The problem is that the current mptcp connect waits for both the
MPTCP socket level timeout, and the first subflow socket timeout.
The latter is not influenced/touched by the exposed setsockopt().
Overall the above makes the SO_SNDTIMEO a no-op on connect.
Since mptcp_connect is invoked via inet_stream_connect and the
latter properly handle the MPTCP level timeout, we can address the
issue making the nested subflow level connect always unblocking.
This also allow simplifying a bit the code, dropping an ugly hack
to handle the fastopen and custom proto_ops connect.
The issues predates the blamed commit below, but the current resolution
requires the infrastructure introduced there.
Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
Note: there is needed follow-up correctly propagating the SYN_SENT
-> CLOSE state transition from the first subflow to the msk socket in
case of connect error, but that is somewhat independed/pre-existent.
Sending out early for testing.
---
net/mptcp/protocol.c | 29 +++++++----------------------
net/mptcp/protocol.h | 1 -
2 files changed, 7 insertions(+), 23 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f5842f93c891..4747058a8b6b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1734,7 +1734,6 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg,
lock_sock(ssk);
msg->msg_flags |= MSG_DONTWAIT;
- msk->connect_flags = O_NONBLOCK;
msk->fastopening = 1;
ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, len, NULL);
msk->fastopening = 0;
@@ -3672,9 +3671,9 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
* acquired the subflow socket lock, too.
*/
if (msk->fastopening)
- err = __inet_stream_connect(ssock, uaddr, addr_len, msk->connect_flags, 1);
+ err = __inet_stream_connect(ssock, uaddr, addr_len, O_NONBLOCK, 1);
else
- err = inet_stream_connect(ssock, uaddr, addr_len, msk->connect_flags);
+ err = inet_stream_connect(ssock, uaddr, addr_len, O_NONBLOCK);
inet_sk(sk)->defer_connect = inet_sk(ssock->sk)->defer_connect;
/* on successful connect, the msk state will be moved to established by
@@ -3687,12 +3686,10 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
mptcp_copy_inaddrs(sk, ssock->sk);
- /* unblocking connect, mptcp-level inet_stream_connect will error out
- * without changing the socket state, update it here.
+ /* silence EINPROGRESS and let the caller inet_stream_connect
+ * handle the connection in progress
*/
- if (err == -EINPROGRESS)
- sk->sk_socket->state = ssock->state;
- return err;
+ return 0;
}
static struct proto mptcp_prot = {
@@ -3751,18 +3748,6 @@ static int mptcp_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
return err;
}
-static int mptcp_stream_connect(struct socket *sock, struct sockaddr *uaddr,
- int addr_len, int flags)
-{
- int ret;
-
- lock_sock(sock->sk);
- mptcp_sk(sock->sk)->connect_flags = flags;
- ret = __inet_stream_connect(sock, uaddr, addr_len, flags, 0);
- release_sock(sock->sk);
- return ret;
-}
-
static int mptcp_listen(struct socket *sock, int backlog)
{
struct mptcp_sock *msk = mptcp_sk(sock->sk);
@@ -3917,7 +3902,7 @@ static const struct proto_ops mptcp_stream_ops = {
.owner = THIS_MODULE,
.release = inet_release,
.bind = mptcp_bind,
- .connect = mptcp_stream_connect,
+ .connect = inet_stream_connect,
.socketpair = sock_no_socketpair,
.accept = mptcp_stream_accept,
.getname = inet_getname,
@@ -4012,7 +3997,7 @@ static const struct proto_ops mptcp_v6_stream_ops = {
.owner = THIS_MODULE,
.release = inet6_release,
.bind = mptcp_bind,
- .connect = mptcp_stream_connect,
+ .connect = inet_stream_connect,
.socketpair = sock_no_socketpair,
.accept = mptcp_stream_accept,
.getname = inet6_getname,
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 8d9d87e2f840..39613bba1c46 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -300,7 +300,6 @@ struct mptcp_sock {
nodelay:1,
fastopening:1,
in_accept_queue:1;
- int connect_flags;
struct work_struct work;
struct sk_buff *ooo_last_skb;
struct rb_root out_of_order_queue;
--
2.40.0
Hi Paolo, Mat, On 05/05/2023 21:34, Paolo Abeni wrote: > Ondrej reported a functional issue WRT timeout handling on connect > with a nice reproducer. > > The problem is that the current mptcp connect waits for both the > MPTCP socket level timeout, and the first subflow socket timeout. > The latter is not influenced/touched by the exposed setsockopt(). > > Overall the above makes the SO_SNDTIMEO a no-op on connect. > > Since mptcp_connect is invoked via inet_stream_connect and the > latter properly handle the MPTCP level timeout, we can address the > issue making the nested subflow level connect always unblocking. > > This also allow simplifying a bit the code, dropping an ugly hack > to handle the fastopen and custom proto_ops connect. > > The issues predates the blamed commit below, but the current resolution > requires the infrastructure introduced there. > > Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()") > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399 > Signed-off-by: Paolo Abeni <pabeni@redhat.com> Thank you for the fix and the review! I just applied it in our repo with Mat's RvB tag. I also added Ondrej's Reported-by tag. New patches for t/upstream-net and t/upstream: - 59d33b72ee59: mptcp: fix connect timeout handling - 8d8fd087a52c: tg:msg: add Ondrej's Reported-by tag - Results: 8ecd3b5ba105..fa4a5780caa4 (export-net) - Results: d9d25c574c4e..b8e056f78785 (export) Tests are now in progress: https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20230515T155446 https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20230515T155446 Cheers, Matt -- Tessares | Belgium | Hybrid Access Solutions www.tessares.net
On Fri, 5 May 2023, Paolo Abeni wrote: > Ondrej reported a functional issue WRT timeout handling on connect > with a nice reproducer. > > The problem is that the current mptcp connect waits for both the > MPTCP socket level timeout, and the first subflow socket timeout. > The latter is not influenced/touched by the exposed setsockopt(). > > Overall the above makes the SO_SNDTIMEO a no-op on connect. > > Since mptcp_connect is invoked via inet_stream_connect and the > latter properly handle the MPTCP level timeout, we can address the > issue making the nested subflow level connect always unblocking. > > This also allow simplifying a bit the code, dropping an ugly hack > to handle the fastopen and custom proto_ops connect. > > The issues predates the blamed commit below, but the current resolution > requires the infrastructure introduced there. > > Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()") > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399 > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > --- > Note: there is needed follow-up correctly propagating the SYN_SENT > -> CLOSE state transition from the first subflow to the msk socket in > case of connect error, but that is somewhat independed/pre-existent. > > Sending out early for testing. Hi Paolo - Patch LGTM, and appreciate the cleanup of msk->connect_flags: Reviewed-by: Mat Martineau <martineau@kernel.org> I think it would be good to also add a selftest in net-next to check the connect timeout (set to a very short time), but that doesn't need to block this -net change. Thanks, Mat > --- > net/mptcp/protocol.c | 29 +++++++---------------------- > net/mptcp/protocol.h | 1 - > 2 files changed, 7 insertions(+), 23 deletions(-) > > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c > index f5842f93c891..4747058a8b6b 100644 > --- a/net/mptcp/protocol.c > +++ b/net/mptcp/protocol.c > @@ -1734,7 +1734,6 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, > > lock_sock(ssk); > msg->msg_flags |= MSG_DONTWAIT; > - msk->connect_flags = O_NONBLOCK; > msk->fastopening = 1; > ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, len, NULL); > msk->fastopening = 0; > @@ -3672,9 +3671,9 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) > * acquired the subflow socket lock, too. > */ > if (msk->fastopening) > - err = __inet_stream_connect(ssock, uaddr, addr_len, msk->connect_flags, 1); > + err = __inet_stream_connect(ssock, uaddr, addr_len, O_NONBLOCK, 1); > else > - err = inet_stream_connect(ssock, uaddr, addr_len, msk->connect_flags); > + err = inet_stream_connect(ssock, uaddr, addr_len, O_NONBLOCK); > inet_sk(sk)->defer_connect = inet_sk(ssock->sk)->defer_connect; > > /* on successful connect, the msk state will be moved to established by > @@ -3687,12 +3686,10 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) > > mptcp_copy_inaddrs(sk, ssock->sk); > > - /* unblocking connect, mptcp-level inet_stream_connect will error out > - * without changing the socket state, update it here. > + /* silence EINPROGRESS and let the caller inet_stream_connect > + * handle the connection in progress > */ > - if (err == -EINPROGRESS) > - sk->sk_socket->state = ssock->state; > - return err; > + return 0; > } > > static struct proto mptcp_prot = { > @@ -3751,18 +3748,6 @@ static int mptcp_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) > return err; > } > > -static int mptcp_stream_connect(struct socket *sock, struct sockaddr *uaddr, > - int addr_len, int flags) > -{ > - int ret; > - > - lock_sock(sock->sk); > - mptcp_sk(sock->sk)->connect_flags = flags; > - ret = __inet_stream_connect(sock, uaddr, addr_len, flags, 0); > - release_sock(sock->sk); > - return ret; > -} > - > static int mptcp_listen(struct socket *sock, int backlog) > { > struct mptcp_sock *msk = mptcp_sk(sock->sk); > @@ -3917,7 +3902,7 @@ static const struct proto_ops mptcp_stream_ops = { > .owner = THIS_MODULE, > .release = inet_release, > .bind = mptcp_bind, > - .connect = mptcp_stream_connect, > + .connect = inet_stream_connect, > .socketpair = sock_no_socketpair, > .accept = mptcp_stream_accept, > .getname = inet_getname, > @@ -4012,7 +3997,7 @@ static const struct proto_ops mptcp_v6_stream_ops = { > .owner = THIS_MODULE, > .release = inet6_release, > .bind = mptcp_bind, > - .connect = mptcp_stream_connect, > + .connect = inet_stream_connect, > .socketpair = sock_no_socketpair, > .accept = mptcp_stream_accept, > .getname = inet6_getname, > diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h > index 8d9d87e2f840..39613bba1c46 100644 > --- a/net/mptcp/protocol.h > +++ b/net/mptcp/protocol.h > @@ -300,7 +300,6 @@ struct mptcp_sock { > nodelay:1, > fastopening:1, > in_accept_queue:1; > - int connect_flags; > struct work_struct work; > struct sk_buff *ooo_last_skb; > struct rb_root out_of_order_queue; > -- > 2.40.0 > > >
On Fri, 2023-05-12 at 12:24 -0700, Mat Martineau wrote: > On Fri, 5 May 2023, Paolo Abeni wrote: > > > Ondrej reported a functional issue WRT timeout handling on connect > > with a nice reproducer. > > > > The problem is that the current mptcp connect waits for both the > > MPTCP socket level timeout, and the first subflow socket timeout. > > The latter is not influenced/touched by the exposed setsockopt(). > > > > Overall the above makes the SO_SNDTIMEO a no-op on connect. > > > > Since mptcp_connect is invoked via inet_stream_connect and the > > latter properly handle the MPTCP level timeout, we can address the > > issue making the nested subflow level connect always unblocking. > > > > This also allow simplifying a bit the code, dropping an ugly hack > > to handle the fastopen and custom proto_ops connect. > > > > The issues predates the blamed commit below, but the current resolution > > requires the infrastructure introduced there. > > > > Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()") > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399 > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > --- > > Note: there is needed follow-up correctly propagating the SYN_SENT > > -> CLOSE state transition from the first subflow to the msk socket in > > case of connect error, but that is somewhat independed/pre-existent. > > > > Sending out early for testing. > > Hi Paolo - > > Patch LGTM, and appreciate the cleanup of msk->connect_flags: > > Reviewed-by: Mat Martineau <martineau@kernel.org> > > I think it would be good to also add a selftest in net-next to check the > connect timeout (set to a very short time), but that doesn't need to block > this -net change. Agreed. Perhaps a pktdrill script would could be simpler option?!? Thanks! Paolo
Hi Paolo, Mat, On 15/05/2023 10:16, Paolo Abeni wrote: > On Fri, 2023-05-12 at 12:24 -0700, Mat Martineau wrote: >> On Fri, 5 May 2023, Paolo Abeni wrote: >> >>> Ondrej reported a functional issue WRT timeout handling on connect >>> with a nice reproducer. >>> >>> The problem is that the current mptcp connect waits for both the >>> MPTCP socket level timeout, and the first subflow socket timeout. >>> The latter is not influenced/touched by the exposed setsockopt(). >>> >>> Overall the above makes the SO_SNDTIMEO a no-op on connect. >>> >>> Since mptcp_connect is invoked via inet_stream_connect and the >>> latter properly handle the MPTCP level timeout, we can address the >>> issue making the nested subflow level connect always unblocking. >>> >>> This also allow simplifying a bit the code, dropping an ugly hack >>> to handle the fastopen and custom proto_ops connect. >>> >>> The issues predates the blamed commit below, but the current resolution >>> requires the infrastructure introduced there. >>> >>> Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()") >>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399 >>> Signed-off-by: Paolo Abeni <pabeni@redhat.com> >>> --- >>> Note: there is needed follow-up correctly propagating the SYN_SENT >>> -> CLOSE state transition from the first subflow to the msk socket in >>> case of connect error, but that is somewhat independed/pre-existent. >>> >>> Sending out early for testing. >> >> Hi Paolo - >> >> Patch LGTM, and appreciate the cleanup of msk->connect_flags: >> >> Reviewed-by: Mat Martineau <martineau@kernel.org> >> >> I think it would be good to also add a selftest in net-next to check the >> connect timeout (set to a very short time), but that doesn't need to block >> this -net change. > > Agreed. Perhaps a pktdrill script would could be simpler option?!? Good idea, that seems more appropriated, especially with the timeout part. Just to avoid deadlocks: is there someone working on that? :) But as Mat said, this is not blocking, I can apply the patch. Cheers, Matt -- Tessares | Belgium | Hybrid Access Solutions www.tessares.net
On Mon, 2023-05-15 at 17:46 +0200, Matthieu Baerts wrote: > Hi Paolo, Mat, > > On 15/05/2023 10:16, Paolo Abeni wrote: > > On Fri, 2023-05-12 at 12:24 -0700, Mat Martineau wrote: > > > On Fri, 5 May 2023, Paolo Abeni wrote: > > > > > > > Ondrej reported a functional issue WRT timeout handling on connect > > > > with a nice reproducer. > > > > > > > > The problem is that the current mptcp connect waits for both the > > > > MPTCP socket level timeout, and the first subflow socket timeout. > > > > The latter is not influenced/touched by the exposed setsockopt(). > > > > > > > > Overall the above makes the SO_SNDTIMEO a no-op on connect. > > > > > > > > Since mptcp_connect is invoked via inet_stream_connect and the > > > > latter properly handle the MPTCP level timeout, we can address the > > > > issue making the nested subflow level connect always unblocking. > > > > > > > > This also allow simplifying a bit the code, dropping an ugly hack > > > > to handle the fastopen and custom proto_ops connect. > > > > > > > > The issues predates the blamed commit below, but the current resolution > > > > requires the infrastructure introduced there. > > > > > > > > Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()") > > > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399 > > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > > > --- > > > > Note: there is needed follow-up correctly propagating the SYN_SENT > > > > -> CLOSE state transition from the first subflow to the msk socket in > > > > case of connect error, but that is somewhat independed/pre-existent. > > > > > > > > Sending out early for testing. > > > > > > Hi Paolo - > > > > > > Patch LGTM, and appreciate the cleanup of msk->connect_flags: > > > > > > Reviewed-by: Mat Martineau <martineau@kernel.org> > > > > > > I think it would be good to also add a selftest in net-next to check the > > > connect timeout (set to a very short time), but that doesn't need to block > > > this -net change. > > > > Agreed. Perhaps a pktdrill script would could be simpler option?!? > > Good idea, that seems more appropriated, especially with the timeout part. > > Just to avoid deadlocks: is there someone working on that? :) Yep, just opended a PR: https://github.com/multipath-tcp/packetdrill/pull/115 /P
On Fri, 2023-05-05 at 21:34 +0200, Paolo Abeni wrote: > Note: there is needed follow-up correctly propagating the SYN_SENT > -> CLOSE state transition from the first subflow to the msk socket in > case of connect error, but that is somewhat independed/pre-existent. Re-reading the relevant code, there is no need for such follow-up: commit 1249db44a102 ("mptcp: be careful on subflow status propagation on errors") already take care of it. Cheers, Paolo
© 2016 - 2023 Red Hat, Inc.