[PATCH mptcp-net] mptcp: set msk local address earlier

Paolo Abeni posted 1 patch 1 year, 6 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/f5e8bfe9878d3948df9328870fc35daec6388e2b.1665431162.git.pabeni@redhat.com
Maintainers: Mat Martineau <mathew.j.martineau@linux.intel.com>, Matthieu Baerts <matthieu.baerts@tessares.net>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
net/mptcp/protocol.c | 3 +--
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c  | 7 +++++++
3 files changed, 9 insertions(+), 2 deletions(-)
[PATCH mptcp-net] mptcp: set msk local address earlier
Posted by Paolo Abeni 1 year, 6 months ago
The mptcp_pm_nl_get_local_id() code assumes that the msk local address
is available at that point. For passive sockets, we initialize such
address at accept() time.

Depending on the running configuration and the user-space timing, a
passive MPJ subflow can join the msk socket before accept() completes.

In such case, the PM assigns a wrong local id to the MPJ subflow
and later PM netlink operations will end-up touching the wrong/unexpected
subflow.

All the above causes sporadic self-tests failures, especially when
the host is heavy loaded.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 3 +--
 net/mptcp/protocol.h | 1 +
 net/mptcp/subflow.c  | 7 +++++++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 1aa940928b4f..d34765db0700 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2951,7 +2951,7 @@ static void mptcp_close(struct sock *sk, long timeout)
 	sock_put(sk);
 }
 
-static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
+void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
 {
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
 	const struct ipv6_pinfo *ssk6 = inet6_sk(ssk);
@@ -3702,7 +3702,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 		if (mptcp_is_fully_established(newsk))
 			mptcp_pm_fully_established(msk, msk->first, GFP_KERNEL);
 
-		mptcp_copy_inaddrs(newsk, msk->first);
 		mptcp_rcv_space_init(msk, msk->first);
 		mptcp_propagate_sndbuf(newsk, msk->first);
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 18f866b1afda..2358a4083eb3 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -604,6 +604,7 @@ int mptcp_allow_join_id0(const struct net *net);
 unsigned int mptcp_stale_loss_cnt(const struct net *net);
 int mptcp_get_pm_type(const struct net *net);
 const char *mptcp_get_scheduler(const struct net *net);
+void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk);
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool __mptcp_retransmit_pending_data(struct sock *sk);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 07dd23d0fe04..05a2b054287a 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -723,6 +723,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 				goto dispose_child;
 			}
 
+			if (new_msk)
+				mptcp_copy_inaddrs(new_msk, child);
 			subflow_drop_ctx(child);
 			goto out;
 		}
@@ -750,6 +752,11 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 			ctx->conn = new_msk;
 			new_msk = NULL;
 
+			/* set msk addresse early to ensure mptcp_pm_get_local_id()
+			 * uses the correct data
+			 */
+			mptcp_copy_inaddrs(ctx->conn, child);
+
 			/* with OoO packets we can reach here without ingress
 			 * mpc option
 			 */
-- 
2.37.3
Re: [PATCH mptcp-net] mptcp: set msk local address earlier
Posted by Matthieu Baerts 1 year, 6 months ago
Hi Paolo, Mat,

On 10/10/2022 21:56, Paolo Abeni wrote:
> The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> is available at that point. For passive sockets, we initialize such
> address at accept() time.
> 
> Depending on the running configuration and the user-space timing, a
> passive MPJ subflow can join the msk socket before accept() completes.
> 
> In such case, the PM assigns a wrong local id to the MPJ subflow
> and later PM netlink operations will end-up touching the wrong/unexpected
> subflow.
> 
> All the above causes sporadic self-tests failures, especially when
> the host is heavy loaded.

Thank you for the patch and the review!

Now in our tree (fix for -net) with Mat's RvB tag, without a typo
reported by 'checkpatch.pl --codespell'[1] and with a fix for the simple
conflict reported by Mat:


New patches for t/upstream-net:
- 82e484041f04: mptcp: set msk local address earlier
- Results: fc5b58f5783a..6cb6fb39b470 (export-net)

New patches for t/upstream:
- 82e484041f04: mptcp: set msk local address earlier
- 345f2d3a78c6: conflict in t/mptcp-add-a-new-sysctl-scheduler
- Results: 01e347f336e0..d99ebc8b27c4 (export)


Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20221012T130521
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20221012T130521


Cheers,
Matt

[1] https://github.com/multipath-tcp/mptcp_net-next/actions/runs/3222012193
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
Re: [PATCH mptcp-net] mptcp: set msk local address earlier
Posted by Mat Martineau 1 year, 6 months ago
On Mon, 10 Oct 2022, Paolo Abeni wrote:

> The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> is available at that point. For passive sockets, we initialize such
> address at accept() time.
>
> Depending on the running configuration and the user-space timing, a
> passive MPJ subflow can join the msk socket before accept() completes.
>
> In such case, the PM assigns a wrong local id to the MPJ subflow
> and later PM netlink operations will end-up touching the wrong/unexpected
> subflow.
>
> All the above causes sporadic self-tests failures, especially when
> the host is heavy loaded.
>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
> Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
> Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

This has a minor conflict in protocol.h when trying to apply to 
export-net, but it's easy to resolve (mptcp_get_scheduler is in export 
but not export-net).

I ran in to one failure in a test that wasn't mentioned in #308 the first 
time I tried running "mptcp_join.sh -rp" but couldn't reproduce it:

010 flush addresses                      syn[fail] got 2 JOIN[s] syn expected 3
  - synack[fail] got 2 JOIN[s] synack expected 3
  - ack[fail] got 2 JOIN[s] ack expected 3
                                          add[ ok ] - echo  [fail] got 2 ADD_ADDR echo[s] expected 3
                                          rm [fail] got 2 RM_ADDR[s] expected 3
  - rmsf  [fail] got 2 RM_SUBFLOW[s] expected in range [3:6]

but I don't think that's related? (Let me know if the MIB list is useful, 
I saved that too)

Code changes look good to me, seems like it's worth trying in the 
export-net branch so we can test it with the CI builds:

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>



> ---
> net/mptcp/protocol.c | 3 +--
> net/mptcp/protocol.h | 1 +
> net/mptcp/subflow.c  | 7 +++++++
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 1aa940928b4f..d34765db0700 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2951,7 +2951,7 @@ static void mptcp_close(struct sock *sk, long timeout)
> 	sock_put(sk);
> }
>
> -static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
> +void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
> {
> #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> 	const struct ipv6_pinfo *ssk6 = inet6_sk(ssk);
> @@ -3702,7 +3702,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
> 		if (mptcp_is_fully_established(newsk))
> 			mptcp_pm_fully_established(msk, msk->first, GFP_KERNEL);
>
> -		mptcp_copy_inaddrs(newsk, msk->first);
> 		mptcp_rcv_space_init(msk, msk->first);
> 		mptcp_propagate_sndbuf(newsk, msk->first);
>
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 18f866b1afda..2358a4083eb3 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -604,6 +604,7 @@ int mptcp_allow_join_id0(const struct net *net);
> unsigned int mptcp_stale_loss_cnt(const struct net *net);
> int mptcp_get_pm_type(const struct net *net);
> const char *mptcp_get_scheduler(const struct net *net);
> +void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk);
> void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
> 				     struct mptcp_options_received *mp_opt);
> bool __mptcp_retransmit_pending_data(struct sock *sk);
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 07dd23d0fe04..05a2b054287a 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -723,6 +723,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
> 				goto dispose_child;
> 			}
>
> +			if (new_msk)
> +				mptcp_copy_inaddrs(new_msk, child);
> 			subflow_drop_ctx(child);
> 			goto out;
> 		}
> @@ -750,6 +752,11 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
> 			ctx->conn = new_msk;
> 			new_msk = NULL;
>
> +			/* set msk addresse early to ensure mptcp_pm_get_local_id()
> +			 * uses the correct data
> +			 */
> +			mptcp_copy_inaddrs(ctx->conn, child);
> +
> 			/* with OoO packets we can reach here without ingress
> 			 * mpc option
> 			 */
> -- 
> 2.37.3
>
>
>

--
Mat Martineau
Intel
Re: [PATCH mptcp-net] mptcp: set msk local address earlier
Posted by Paolo Abeni 1 year, 6 months ago
Hello,

On Mon, 2022-10-10 at 16:38 -0700, Mat Martineau wrote:
> On Mon, 10 Oct 2022, Paolo Abeni wrote:
> 
> > The mptcp_pm_nl_get_local_id() code assumes that the msk local address
> > is available at that point. For passive sockets, we initialize such
> > address at accept() time.
> > 
> > Depending on the running configuration and the user-space timing, a
> > passive MPJ subflow can join the msk socket before accept() completes.
> > 
> > In such case, the PM assigns a wrong local id to the MPJ subflow
> > and later PM netlink operations will end-up touching the wrong/unexpected
> > subflow.
> > 
> > All the above causes sporadic self-tests failures, especially when
> > the host is heavy loaded.
> > 
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/308
> > Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
> > Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> 
> This has a minor conflict in protocol.h when trying to apply to 
> export-net, but it's easy to resolve (mptcp_get_scheduler is in export 
> but not export-net).
> 
> I ran in to one failure in a test that wasn't mentioned in #308 the first 
> time I tried running "mptcp_join.sh -rp" but couldn't reproduce it:
> 
> 010 flush addresses                      syn[fail] got 2 JOIN[s] syn expected 3
>   - synack[fail] got 2 JOIN[s] synack expected 3
>   - ack[fail] got 2 JOIN[s] ack expected 3
>                                           add[ ok ] - echo  [fail] got 2 ADD_ADDR echo[s] expected 3
>                                           rm [fail] got 2 RM_ADDR[s] expected 3
>   - rmsf  [fail] got 2 RM_SUBFLOW[s] expected in range [3:6]
> 
> but I don't think that's related? 

That *looks* unrelated, and is possibly due to plain slowness
(mptcp_connect is unable to establish all the 3 subflows before the nl
pm starts deleting the addresses).

> (Let me know if the MIB list is useful, 
> I saved that too)

The MIBs could possibly be useful, but I guess not decisive ;)

Thanks,

Paolo