mptcp: Do not traverse the subflow connection list without lock

[PATCH mptcp-next v2] mptcp: Do not traverse the subflow connection list without lock

Posted by Mat Martineau 3 years, 9 months ago

The MPTCP socket's conn_list (list of subflows) requires the socket lock
to access. The MP_FAIL timeout code added such an access, where it would
check the list of subflows both in timer context and (later) in workqueue
context where the socket lock is held.

Rather than check the list twice, remove the check in the timeout
handler and only depend on the check in the workqueue. Also remove the
MPTCP_FAIL_NO_RESPONSE flag, since mptcp_mp_fail_no_response() has
insignificant overhead and can be checked on each worker run.

Reported-by: Paolo Abeni <pabeni@redhat.com>
Fixes: 49fa1919d6bc ("mptcp: reset subflow when MP_FAIL doesn't respond")
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---

v2: Remove flag.

---
 net/mptcp/protocol.c | 16 +---------------
 net/mptcp/protocol.h |  1 -
 2 files changed, 1 insertion(+), 16 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3c55e7f45aef..d6aef4b13b8a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2182,23 +2182,10 @@ mp_fail_response_expect_subflow(struct mptcp_sock *msk)
 	return ret;
 }
 
-static void mptcp_check_mp_fail_response(struct mptcp_sock *msk)
-{
-	struct mptcp_subflow_context *subflow;
-	struct sock *sk = (struct sock *)msk;
-
-	bh_lock_sock(sk);
-	subflow = mp_fail_response_expect_subflow(msk);
-	if (subflow)
-		__set_bit(MPTCP_FAIL_NO_RESPONSE, &msk->flags);
-	bh_unlock_sock(sk);
-}
-
 static void mptcp_timeout_timer(struct timer_list *t)
 {
 	struct sock *sk = from_timer(sk, t, sk_timer);
 
-	mptcp_check_mp_fail_response(mptcp_sk(sk));
 	mptcp_schedule_work(sk);
 	sock_put(sk);
 }
@@ -2575,8 +2562,7 @@ static void mptcp_worker(struct work_struct *work)
 	if (test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags))
 		__mptcp_retrans(sk);
 
-	if (test_and_clear_bit(MPTCP_FAIL_NO_RESPONSE, &msk->flags))
-		mptcp_mp_fail_no_response(msk);
+	mptcp_mp_fail_no_response(msk);
 
 unlock:
 	release_sock(sk);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 9933c711b787..9ccbe25dcc3e 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -117,7 +117,6 @@
 #define MPTCP_WORK_EOF		3
 #define MPTCP_FALLBACK_DONE	4
 #define MPTCP_WORK_CLOSE_SUBFLOW 5
-#define MPTCP_FAIL_NO_RESPONSE	6
 
 /* MPTCP socket release cb flags */
 #define MPTCP_PUSH_PENDING	1
-- 
2.36.1

Re: [PATCH mptcp-next v2] mptcp: Do not traverse the subflow connection list without lock

Posted by Matthieu Baerts 3 years, 8 months ago

Hi Mat, Paolo,

On 13/05/2022 02:35, Mat Martineau wrote:
> The MPTCP socket's conn_list (list of subflows) requires the socket lock
> to access. The MP_FAIL timeout code added such an access, where it would
> check the list of subflows both in timer context and (later) in workqueue
> context where the socket lock is held.
> 
> Rather than check the list twice, remove the check in the timeout
> handler and only depend on the check in the workqueue. Also remove the
> MPTCP_FAIL_NO_RESPONSE flag, since mptcp_mp_fail_no_response() has
> insignificant overhead and can be checked on each worker run.

Thank you for the patchs and the reviews!

Now in our tree (fixes for net-next) with Paolo's RvB tag:

New patches for t/upstream:

- 032e9e845d39: mptcp: Check for orphaned subflow before handling
MP_FAIL timer

- 8df40881ecfe: mptcp: Do not traverse the subflow connection list
without lock

- Results: 7e5581f27ee6..cd3c57373965 (export)

Builds and tests are now in progress:



https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20220516T162107

https://github.com/multipath-tcp/mptcp_net-next/actions/workflows/build-validation.yml?query=branch:export

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

Re: [PATCH mptcp-next v2] mptcp: Do not traverse the subflow connection list without lock

Posted by Paolo Abeni 3 years, 9 months ago

On Thu, 2022-05-12 at 17:35 -0700, Mat Martineau wrote:
> The MPTCP socket's conn_list (list of subflows) requires the socket lock
> to access. The MP_FAIL timeout code added such an access, where it would
> check the list of subflows both in timer context and (later) in workqueue
> context where the socket lock is held.
> 
> Rather than check the list twice, remove the check in the timeout
> handler and only depend on the check in the workqueue. Also remove the
> MPTCP_FAIL_NO_RESPONSE flag, since mptcp_mp_fail_no_response() has
> insignificant overhead and can be checked on each worker run.
> 
> Reported-by: Paolo Abeni <pabeni@redhat.com>
> Fixes: 49fa1919d6bc ("mptcp: reset subflow when MP_FAIL doesn't respond")
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
> 
> v2: Remove flag.
> 
> ---
>  net/mptcp/protocol.c | 16 +---------------
>  net/mptcp/protocol.h |  1 -
>  2 files changed, 1 insertion(+), 16 deletions(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 3c55e7f45aef..d6aef4b13b8a 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2182,23 +2182,10 @@ mp_fail_response_expect_subflow(struct mptcp_sock *msk)
>  	return ret;
>  }
>  
> -static void mptcp_check_mp_fail_response(struct mptcp_sock *msk)
> -{
> -	struct mptcp_subflow_context *subflow;
> -	struct sock *sk = (struct sock *)msk;
> -
> -	bh_lock_sock(sk);
> -	subflow = mp_fail_response_expect_subflow(msk);
> -	if (subflow)
> -		__set_bit(MPTCP_FAIL_NO_RESPONSE, &msk->flags);
> -	bh_unlock_sock(sk);
> -}
> -
>  static void mptcp_timeout_timer(struct timer_list *t)
>  {
>  	struct sock *sk = from_timer(sk, t, sk_timer);
>  
> -	mptcp_check_mp_fail_response(mptcp_sk(sk));
>  	mptcp_schedule_work(sk);
>  	sock_put(sk);
>  }
> @@ -2575,8 +2562,7 @@ static void mptcp_worker(struct work_struct *work)
>  	if (test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags))
>  		__mptcp_retrans(sk);
>  
> -	if (test_and_clear_bit(MPTCP_FAIL_NO_RESPONSE, &msk->flags))
> -		mptcp_mp_fail_no_response(msk);
> +	mptcp_mp_fail_no_response(msk);
>  
>  unlock:
>  	release_sock(sk);
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 9933c711b787..9ccbe25dcc3e 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -117,7 +117,6 @@
>  #define MPTCP_WORK_EOF		3
>  #define MPTCP_FALLBACK_DONE	4
>  #define MPTCP_WORK_CLOSE_SUBFLOW 5
> -#define MPTCP_FAIL_NO_RESPONSE	6
>  
>  /* MPTCP socket release cb flags */
>  #define MPTCP_PUSH_PENDING	1

LGTM, Thanks!

Reviewed-by: Paolo Abeni <pabeni@redhat.com>

BTW it looks like my recent patch made the infinite mappting test less
stable, but I still can't reproduce the issue locally :(

/P

Re: mptcp: Do not traverse the subflow connection list without lock: Tests Results

Posted by MPTCP CI 3 years, 9 months ago

Hi Mat,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal:
  - Unstable: 1 failed test(s): selftest_mptcp_join 🔴:
  - Task: https://cirrus-ci.com/task/4747222951657472
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/4747222951657472/summary/summary.txt

- KVM Validation: debug:
  - Unstable: 2 failed test(s): selftest_diag selftest_mptcp_join 🔴:
  - Task: https://cirrus-ci.com/task/5873122858500096
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5873122858500096/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/5e566d71122d


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)