The mptcp_pm_nl_get_local_id() code assumes that the msk local address
is available at that point. For passive sockets, we initialize that
address at accept() time.

Depending on the running configuration and the user-space timing, a
passive MPJ subflow can join the msk socket before accept() completes.

In that case, the PM assigns a wrong local id to the MPJ subflow, and
later PM netlink operations will end up touching the wrong/unexpected
subflow.

All the above causes sporadic self-test failures, especially when the
host is heavily loaded.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
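To make the failure mode above concrete, here is a minimal userspace
sketch of an address-keyed local-id lookup. All names here
(struct sketch_endpoint, sketch_get_local_id()) are made up for
illustration; this only loosely models mptcp_pm_nl_get_local_id() and
is not the kernel code:

#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Stand-in for a netlink-PM endpoint entry (hypothetical). */
struct sketch_endpoint {
	struct sockaddr_storage addr;
	uint8_t id;
};

/* Match the msk's local address against configured endpoints.
 * If msk_local is still all-zero, because accept() has not yet
 * copied the first subflow's address into the msk, the memcmp()
 * below compares against bogus data, so an MPJ subflow arriving
 * before accept() can be handed a wrong local id.
 */
static uint8_t sketch_get_local_id(const struct sockaddr_storage *msk_local,
				   const struct sketch_endpoint *eps,
				   int n_eps)
{
	for (int i = 0; i < n_eps; i++) {
		if (!memcmp(&eps[i].addr, msk_local, sizeof(*msk_local)))
			return eps[i].id;
	}
	return 0;	/* no match: caller falls back to a default id */
}

The patch below avoids this by copying the addresses into the msk as
soon as the passive socket is created; see the subflow.c hunks.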
net/mptcp/protocol.c | 3 +--
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 7 +++++++
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 1aa940928b4f..d34765db0700 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2951,7 +2951,7 @@ static void mptcp_close(struct sock *sk, long timeout)
 	sock_put(sk);
 }
 
-static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
+void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
 {
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
 	const struct ipv6_pinfo *ssk6 = inet6_sk(ssk);
@@ -3702,7 +3702,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 		if (mptcp_is_fully_established(newsk))
 			mptcp_pm_fully_established(msk, msk->first, GFP_KERNEL);
 
-		mptcp_copy_inaddrs(newsk, msk->first);
 		mptcp_rcv_space_init(msk, msk->first);
 		mptcp_propagate_sndbuf(newsk, msk->first);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 18f866b1afda..2358a4083eb3 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -604,6 +604,7 @@ int mptcp_allow_join_id0(const struct net *net);
 unsigned int mptcp_stale_loss_cnt(const struct net *net);
 int mptcp_get_pm_type(const struct net *net);
 const char *mptcp_get_scheduler(const struct net *net);
+void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk);
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool __mptcp_retransmit_pending_data(struct sock *sk);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 07dd23d0fe04..05a2b054287a 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -723,6 +723,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 			goto dispose_child;
 		}
 
+		if (new_msk)
+			mptcp_copy_inaddrs(new_msk, child);
 		subflow_drop_ctx(child);
 		goto out;
 	}
@@ -750,6 +752,11 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 	ctx->conn = new_msk;
 	new_msk = NULL;
 
+	/* set msk addresses early to ensure mptcp_pm_get_local_id()
+	 * uses the correct data
+	 */
+	mptcp_copy_inaddrs(ctx->conn, child);
+
 	/* with OoO packets we can reach here without ingress
 	 * mpc option
 	 */
--
2.37.3
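For completeness, a minimal userspace sketch of the ordering the fix
establishes. struct sketch_sock, copy_inaddrs() and
passive_msk_created() are hypothetical illustration names, not the
kernel implementation:

#include <string.h>
#include <sys/socket.h>

/* Simplified stand-in for the kernel's msk/subflow sockets. */
struct sketch_sock {
	struct sockaddr_storage saddr;	/* local address */
	struct sockaddr_storage daddr;	/* remote address */
};

/* Models what mptcp_copy_inaddrs() conceptually does: mirror the
 * subflow's addresses into the msk so address-keyed PM lookups
 * operate on valid data.
 */
static void copy_inaddrs(struct sketch_sock *msk,
			 const struct sketch_sock *ssk)
{
	memcpy(&msk->saddr, &ssk->saddr, sizeof(msk->saddr));
	memcpy(&msk->daddr, &ssk->daddr, sizeof(msk->daddr));
}

/* After the patch, the copy happens when the passive msk is
 * created (at subflow_syn_recv_sock() time), before any MPJ
 * subflow can trigger a local-id lookup, instead of at accept()
 * time as before.
 */
static void passive_msk_created(struct sketch_sock *msk,
				const struct sketch_sock *first_subflow)
{
	copy_inaddrs(msk, first_subflow);
	/* ... PM local-id lookups are now safe ... */
}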
Hi Paolo, Mat,

On 10/10/2022 21:56, Paolo Abeni wrote:
> The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> is available at that point. For passive sockets, we initialize that
> address at accept() time.
>
> Depending on the running configuration and the user-space timing, a
> passive MPJ subflow can join the msk socket before accept() completes.
>
> In that case, the PM assigns a wrong local id to the MPJ subflow, and
> later PM netlink operations will end up touching the wrong/unexpected
> subflow.
>
> All the above causes sporadic self-test failures, especially when the
> host is heavily loaded.

Thank you for the patch and the review!

Now in our tree (fix for -net) with Mat's RvB tag, without the typo
reported by 'checkpatch.pl --codespell' [1] and with a fix for the
simple conflict reported by Mat:

New patches for t/upstream-net:
- 82e484041f04: mptcp: set msk local address earlier
- Results: fc5b58f5783a..6cb6fb39b470 (export-net)

New patches for t/upstream:
- 82e484041f04: mptcp: set msk local address earlier
- 345f2d3a78c6: conflict in t/mptcp-add-a-new-sysctl-scheduler
- Results: 01e347f336e0..d99ebc8b27c4 (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20221012T130521
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20221012T130521

Cheers,
Matt

[1] https://github.com/multipath-tcp/mptcp_net-next/actions/runs/3222012193

--
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
On Mon, 10 Oct 2022, Paolo Abeni wrote:
> The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> is available at that point. For passive sockets, we initialize that
> address at accept() time.
>
> Depending on the running configuration and the user-space timing, a
> passive MPJ subflow can join the msk socket before accept() completes.
>
> In that case, the PM assigns a wrong local id to the MPJ subflow, and
> later PM netlink operations will end up touching the wrong/unexpected
> subflow.
>
> All the above causes sporadic self-test failures, especially when the
> host is heavily loaded.
>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
> Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
> Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This has a minor conflict in protocol.h when trying to apply to
export-net, but it's easy to resolve (mptcp_get_scheduler is in export
but not export-net).
I ran into one failure in a test that wasn't mentioned in #308 the first
time I tried running "mptcp_join.sh -rp" but couldn't reproduce it:
010 flush addresses syn[fail] got 2 JOIN[s] syn expected 3
- synack[fail] got 2 JOIN[s] synack expected 3
- ack[fail] got 2 JOIN[s] ack expected 3
add[ ok ] - echo [fail] got 2 ADD_ADDR echo[s] expected 3
rm [fail] got 2 RM_ADDR[s] expected 3
- rmsf [fail] got 2 RM_SUBFLOW[s] expected in range [3:6]
but I don't think that's related? (Let me know if the MIB list is useful,
I saved that too)
Code changes look good to me; seems like it's worth trying in the
export-net branch so we can test it with the CI builds:
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
> net/mptcp/protocol.c | 3 +--
> net/mptcp/protocol.h | 1 +
> net/mptcp/subflow.c | 7 +++++++
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 1aa940928b4f..d34765db0700 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2951,7 +2951,7 @@ static void mptcp_close(struct sock *sk, long timeout)
>  	sock_put(sk);
>  }
>
> -static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
> +void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
>  {
>  #if IS_ENABLED(CONFIG_MPTCP_IPV6)
>  	const struct ipv6_pinfo *ssk6 = inet6_sk(ssk);
> @@ -3702,7 +3702,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
>  		if (mptcp_is_fully_established(newsk))
>  			mptcp_pm_fully_established(msk, msk->first, GFP_KERNEL);
>
> -		mptcp_copy_inaddrs(newsk, msk->first);
>  		mptcp_rcv_space_init(msk, msk->first);
>  		mptcp_propagate_sndbuf(newsk, msk->first);
>
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 18f866b1afda..2358a4083eb3 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -604,6 +604,7 @@ int mptcp_allow_join_id0(const struct net *net);
>  unsigned int mptcp_stale_loss_cnt(const struct net *net);
>  int mptcp_get_pm_type(const struct net *net);
>  const char *mptcp_get_scheduler(const struct net *net);
> +void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk);
>  void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
>  				     struct mptcp_options_received *mp_opt);
>  bool __mptcp_retransmit_pending_data(struct sock *sk);
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 07dd23d0fe04..05a2b054287a 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -723,6 +723,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
>  			goto dispose_child;
>  		}
>
> +		if (new_msk)
> +			mptcp_copy_inaddrs(new_msk, child);
>  		subflow_drop_ctx(child);
>  		goto out;
>  	}
> @@ -750,6 +752,11 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
>  	ctx->conn = new_msk;
>  	new_msk = NULL;
>
> +	/* set msk addresses early to ensure mptcp_pm_get_local_id()
> +	 * uses the correct data
> +	 */
> +	mptcp_copy_inaddrs(ctx->conn, child);
> +
>  	/* with OoO packets we can reach here without ingress
>  	 * mpc option
>  	 */
> --
> 2.37.3
>
>
>
--
Mat Martineau
Intel
Hello,
On Mon, 2022-10-10 at 16:38 -0700, Mat Martineau wrote:
> On Mon, 10 Oct 2022, Paolo Abeni wrote:
>
> > The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> > is available at that point. For passive sockets, we initialize that
> > address at accept() time.
> >
> > Depending on the running configuration and the user-space timing, a
> > passive MPJ subflow can join the msk socket before accept() completes.
> >
> > In that case, the PM assigns a wrong local id to the MPJ subflow, and
> > later PM netlink operations will end up touching the wrong/unexpected
> > subflow.
> >
> > All the above causes sporadic self-test failures, especially when the
> > host is heavily loaded.
> >
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
> > Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
> > Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>
> This has a minor conflict in protocol.h when trying to apply to
> export-net, but it's easy to resolve (mptcp_get_scheduler is in export
> but not export-net).
>
> I ran into one failure in a test that wasn't mentioned in #308 the first
> time I tried running "mptcp_join.sh -rp" but couldn't reproduce it:
>
> 010 flush addresses syn[fail] got 2 JOIN[s] syn expected 3
> - synack[fail] got 2 JOIN[s] synack expected 3
> - ack[fail] got 2 JOIN[s] ack expected 3
> add[ ok ] - echo [fail] got 2 ADD_ADDR echo[s] expected 3
> rm [fail] got 2 RM_ADDR[s] expected 3
> - rmsf [fail] got 2 RM_SUBFLOW[s] expected in range [3:6]
>
> but I don't think that's related?
That *looks* unrelated, and is possibly due to plain slowness
(mptcp_connect is unable to establish all 3 subflows before the nl
PM starts deleting the addresses).
> (Let me know if the MIB list is useful,
> I saved that too)
The MIBs could possibly be useful, but I guess not decisive ;)
Thanks,
Paolo