The mptcp_pm_nl_get_local_id() code assumes that the msk local address
is available at that point. For passive sockets, we initialize that
address at accept() time.

Depending on the running configuration and the user-space timing, a
passive MPJ subflow can join the msk socket before accept() completes.

In that case, the PM assigns a wrong local id to the MPJ subflow, and
later PM netlink operations will end up touching the wrong/unexpected
subflow.

All the above causes sporadic self-test failures, especially when the
host is heavily loaded.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
net/mptcp/protocol.c | 3 +--
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 7 +++++++
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 1aa940928b4f..d34765db0700 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2951,7 +2951,7 @@ static void mptcp_close(struct sock *sk, long timeout)
 	sock_put(sk);
 }
 
-static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
+void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
 {
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
 	const struct ipv6_pinfo *ssk6 = inet6_sk(ssk);
@@ -3702,7 +3702,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 		if (mptcp_is_fully_established(newsk))
 			mptcp_pm_fully_established(msk, msk->first, GFP_KERNEL);
 
-		mptcp_copy_inaddrs(newsk, msk->first);
 		mptcp_rcv_space_init(msk, msk->first);
 		mptcp_propagate_sndbuf(newsk, msk->first);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 18f866b1afda..2358a4083eb3 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -604,6 +604,7 @@ int mptcp_allow_join_id0(const struct net *net);
 unsigned int mptcp_stale_loss_cnt(const struct net *net);
 int mptcp_get_pm_type(const struct net *net);
 const char *mptcp_get_scheduler(const struct net *net);
+void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk);
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool __mptcp_retransmit_pending_data(struct sock *sk);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 07dd23d0fe04..05a2b054287a 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -723,6 +723,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 			goto dispose_child;
 		}
 
+		if (new_msk)
+			mptcp_copy_inaddrs(new_msk, child);
 		subflow_drop_ctx(child);
 		goto out;
 	}
@@ -750,6 +752,11 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 		ctx->conn = new_msk;
 		new_msk = NULL;
 
+		/* set msk address early to ensure mptcp_pm_get_local_id()
+		 * uses the correct data
+		 */
+		mptcp_copy_inaddrs(ctx->conn, child);
+
 		/* with OoO packets we can reach here without ingress
 		 * mpc option
 		 */
--
2.37.3
Hi Paolo, Mat,

On 10/10/2022 21:56, Paolo Abeni wrote:
> The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> is available at that point. For passive sockets, we initialize such
> address at accept() time.
>
> Depending on the running configuration and the user-space timing, a
> passive MPJ subflow can join the msk socket before accept() completes.
>
> In such case, the PM assigns a wrong local id to the MPJ subflow
> and later PM netlink operations will end-up touching the wrong/unexpected
> subflow.
>
> All the above causes sporadic self-tests failures, especially when
> the host is heavy loaded.

Thank you for the patch and the review!

Now in our tree (fix for -net) with Mat's RvB tag, without a typo
reported by 'checkpatch.pl --codespell' [1] and with a fix for the
simple conflict reported by Mat:

New patches for t/upstream-net:
- 82e484041f04: mptcp: set msk local address earlier
- Results: fc5b58f5783a..6cb6fb39b470 (export-net)

New patches for t/upstream:
- 82e484041f04: mptcp: set msk local address earlier
- 345f2d3a78c6: conflict in t/mptcp-add-a-new-sysctl-scheduler
- Results: 01e347f336e0..d99ebc8b27c4 (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20221012T130521
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20221012T130521

Cheers,
Matt

[1] https://github.com/multipath-tcp/mptcp_net-next/actions/runs/3222012193

--
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
On Mon, 10 Oct 2022, Paolo Abeni wrote:

> The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> is available at that point. For passive sockets, we initialize such
> address at accept() time.
>
> Depending on the running configuration and the user-space timing, a
> passive MPJ subflow can join the msk socket before accept() completes.
>
> In such case, the PM assigns a wrong local id to the MPJ subflow
> and later PM netlink operations will end-up touching the wrong/unexpected
> subflow.
>
> All the above causes sporadic self-tests failures, especially when
> the host is heavy loaded.
>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
> Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
> Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

This has a minor conflict in protocol.h when trying to apply to
export-net, but it's easy to resolve (mptcp_get_scheduler is in export
but not export-net).

I ran in to one failure in a test that wasn't mentioned in #308 the
first time I tried running "mptcp_join.sh -rp" but couldn't reproduce
it:

010 flush addresses
    syn    [fail] got 2 JOIN[s] syn expected 3
  - synack [fail] got 2 JOIN[s] synack expected 3
  - ack    [fail] got 2 JOIN[s] ack expected 3
    add    [ ok ]
  - echo   [fail] got 2 ADD_ADDR echo[s] expected 3
    rm     [fail] got 2 RM_ADDR[s] expected 3
  - rmsf   [fail] got 2 RM_SUBFLOW[s] expected in range [3:6]

but I don't think that's related? (Let me know if the MIB list is
useful, I saved that too)

Code changes look good to me, seems like it's worth trying in the
export-net branch so we can test it with the CI builds:

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>

> ---
>  net/mptcp/protocol.c | 3 +--
>  net/mptcp/protocol.h | 1 +
>  net/mptcp/subflow.c  | 7 +++++++
>  3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 1aa940928b4f..d34765db0700 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2951,7 +2951,7 @@ static void mptcp_close(struct sock *sk, long timeout)
>  	sock_put(sk);
>  }
>
> -static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
> +void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
>  {
>  #if IS_ENABLED(CONFIG_MPTCP_IPV6)
>  	const struct ipv6_pinfo *ssk6 = inet6_sk(ssk);
> @@ -3702,7 +3702,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
>  		if (mptcp_is_fully_established(newsk))
>  			mptcp_pm_fully_established(msk, msk->first, GFP_KERNEL);
>
> -		mptcp_copy_inaddrs(newsk, msk->first);
>  		mptcp_rcv_space_init(msk, msk->first);
>  		mptcp_propagate_sndbuf(newsk, msk->first);
>
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 18f866b1afda..2358a4083eb3 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -604,6 +604,7 @@ int mptcp_allow_join_id0(const struct net *net);
>  unsigned int mptcp_stale_loss_cnt(const struct net *net);
>  int mptcp_get_pm_type(const struct net *net);
>  const char *mptcp_get_scheduler(const struct net *net);
> +void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk);
>  void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
>  				     struct mptcp_options_received *mp_opt);
>  bool __mptcp_retransmit_pending_data(struct sock *sk);
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 07dd23d0fe04..05a2b054287a 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -723,6 +723,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
>  			goto dispose_child;
>  		}
>
> +		if (new_msk)
> +			mptcp_copy_inaddrs(new_msk, child);
>  		subflow_drop_ctx(child);
>  		goto out;
>  	}
> @@ -750,6 +752,11 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
>  		ctx->conn = new_msk;
>  		new_msk = NULL;
>
> +		/* set msk addresse early to ensure mptcp_pm_get_local_id()
> +		 * uses the correct data
> +		 */
> +		mptcp_copy_inaddrs(ctx->conn, child);
> +
>  		/* with OoO packets we can reach here without ingress
>  		 * mpc option
>  		 */
> --
> 2.37.3

--
Mat Martineau
Intel
Hello,

On Mon, 2022-10-10 at 16:38 -0700, Mat Martineau wrote:
> On Mon, 10 Oct 2022, Paolo Abeni wrote:
>
> > The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> > is available at that point. For passive sockets, we initialize such
> > address at accept() time.
> >
> > Depending on the running configuration and the user-space timing, a
> > passive MPJ subflow can join the msk socket before accept() completes.
> >
> > In such case, the PM assigns a wrong local id to the MPJ subflow
> > and later PM netlink operations will end-up touching the wrong/unexpected
> > subflow.
> >
> > All the above causes sporadic self-tests failures, especially when
> > the host is heavy loaded.
> >
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
> > Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
> > Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>
> This has a minor conflict in protocol.h when trying to apply to
> export-net, but it's easy to resolve (mptcp_get_scheduler is in export
> but not export-net).
>
> I ran in to one failure in a test that wasn't mentioned in #308 the first
> time I tried running "mptcp_join.sh -rp" but couldn't reproduce it:
>
> 010 flush addresses
>     syn    [fail] got 2 JOIN[s] syn expected 3
>   - synack [fail] got 2 JOIN[s] synack expected 3
>   - ack    [fail] got 2 JOIN[s] ack expected 3
>     add    [ ok ]
>   - echo   [fail] got 2 ADD_ADDR echo[s] expected 3
>     rm     [fail] got 2 RM_ADDR[s] expected 3
>   - rmsf   [fail] got 2 RM_SUBFLOW[s] expected in range [3:6]
>
> but I don't think that's related?

That *looks* unrelated, and is possibly due to plain slowness
(mptcp_connect is unable to establish all the 3 subflows before the nl
PM starts deleting the addresses).

> (Let me know if the MIB list is useful, I saved that too)

The MIBs could possibly be useful, but I guess not decisive ;)

Thanks,

Paolo