The MPTCP socket's conn_list (list of subflows) requires the socket lock
to access. The MP_FAIL timeout code added such an access, where it would
check the list of subflows both in timer context and (later) in workqueue
context where the socket lock is held.
Rather than check the list twice, remove the check in the timeout
handler and only depend on the check in the workqueue. Also remove the
MPTCP_FAIL_NO_RESPONSE flag, since mptcp_mp_fail_no_response() has
insignificant overhead and can be checked on each worker run.
Reported-by: Paolo Abeni <pabeni@redhat.com>
Fixes: 49fa1919d6bc ("mptcp: reset subflow when MP_FAIL doesn't respond")
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
v2: Remove flag.
---
net/mptcp/protocol.c | 16 +---------------
net/mptcp/protocol.h | 1 -
2 files changed, 1 insertion(+), 16 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3c55e7f45aef..d6aef4b13b8a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2182,23 +2182,10 @@ mp_fail_response_expect_subflow(struct mptcp_sock *msk)
return ret;
}
-static void mptcp_check_mp_fail_response(struct mptcp_sock *msk)
-{
- struct mptcp_subflow_context *subflow;
- struct sock *sk = (struct sock *)msk;
-
- bh_lock_sock(sk);
- subflow = mp_fail_response_expect_subflow(msk);
- if (subflow)
- __set_bit(MPTCP_FAIL_NO_RESPONSE, &msk->flags);
- bh_unlock_sock(sk);
-}
-
static void mptcp_timeout_timer(struct timer_list *t)
{
struct sock *sk = from_timer(sk, t, sk_timer);
- mptcp_check_mp_fail_response(mptcp_sk(sk));
mptcp_schedule_work(sk);
sock_put(sk);
}
@@ -2575,8 +2562,7 @@ static void mptcp_worker(struct work_struct *work)
if (test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags))
__mptcp_retrans(sk);
- if (test_and_clear_bit(MPTCP_FAIL_NO_RESPONSE, &msk->flags))
- mptcp_mp_fail_no_response(msk);
+ mptcp_mp_fail_no_response(msk);
unlock:
release_sock(sk);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 9933c711b787..9ccbe25dcc3e 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -117,7 +117,6 @@
#define MPTCP_WORK_EOF 3
#define MPTCP_FALLBACK_DONE 4
#define MPTCP_WORK_CLOSE_SUBFLOW 5
-#define MPTCP_FAIL_NO_RESPONSE 6
/* MPTCP socket release cb flags */
#define MPTCP_PUSH_PENDING 1
--
2.36.1
Hi Mat, Paolo,

On 13/05/2022 02:35, Mat Martineau wrote:
> The MPTCP socket's conn_list (list of subflows) requires the socket lock
> to access. [...]

Thank you for the patches and the reviews!

Now in our tree (fixes for net-next) with Paolo's RvB tag:

New patches for t/upstream:
- 032e9e845d39: mptcp: Check for orphaned subflow before handling MP_FAIL timer
- 8df40881ecfe: mptcp: Do not traverse the subflow connection list without lock
- Results: 7e5581f27ee6..cd3c57373965 (export)

Builds and tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20220516T162107
https://github.com/multipath-tcp/mptcp_net-next/actions/workflows/build-validation.yml?query=branch:export

Cheers,
Matt
--
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
On Thu, 2022-05-12 at 17:35 -0700, Mat Martineau wrote:
> The MPTCP socket's conn_list (list of subflows) requires the socket lock
> to access. [...]

LGTM, Thanks!

Reviewed-by: Paolo Abeni <pabeni@redhat.com>

BTW it looks like my recent patch made the infinite mapping test less
stable, but I still can't reproduce the issue locally :(

/P
Hi Mat,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal:
  - Unstable: 1 failed test(s): selftest_mptcp_join 🔴:
  - Task: https://cirrus-ci.com/task/4747222951657472
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/4747222951657472/summary/summary.txt

- KVM Validation: debug:
  - Unstable: 2 failed test(s): selftest_diag selftest_mptcp_join 🔴:
  - Task: https://cirrus-ci.com/task/5873122858500096
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5873122858500096/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/5e566d71122d

If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

https://github.com/multipath-tcp/mptcp-upstream-virtme-docker

Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)