[PATCH net-next] mptcp: fix deadlock in mptcp_close()

Dan Carpenter posted 1 patch 1 year, 11 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/YnOu5xlGgE2Ln7lj@kili
Maintainers: Matthieu Baerts <matthieu.baerts@tessares.net>, Mat Martineau <mathew.j.martineau@linux.intel.com>, "David S. Miller" <davem@davemloft.net>, Jakub Kicinski <kuba@kernel.org>, Eric Dumazet <edumazet@google.com>, Geliang Tang <geliang.tang@suse.com>, Paolo Abeni <pabeni@redhat.com>
net/mptcp/protocol.c | 2 --
1 file changed, 2 deletions(-)
[PATCH net-next] mptcp: fix deadlock in mptcp_close()
Posted by Dan Carpenter 1 year, 11 months ago
The mptcp_data_lock/unlock(sk) functions are taking the same spin lock
as the lock_sock()/release_sock() functions.  So we're already holding
the lock at this point and taking it again will lead to a deadlock.

Fixes: 4293248c6704 ("mptcp: add data lock for sk timers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
---
From static analysis.  Not tested.

 net/mptcp/protocol.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index a5d466e6b538..eef4673dae3a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2872,9 +2872,7 @@ static void mptcp_close(struct sock *sk, long timeout)
 		__mptcp_destroy_sock(sk);
 		do_cancel_work = true;
 	} else {
-		mptcp_data_lock(sk);
 		sk_reset_timer(sk, &sk->sk_timer, jiffies + TCP_TIMEWAIT_LEN);
-		mptcp_data_unlock(sk);
 	}
 	release_sock(sk);
 	if (do_cancel_work)
-- 
2.35.1


Re: mptcp: fix deadlock in mptcp_close(): Tests Results
Posted by MPTCP CI 1 year, 11 months ago
Hi Dan,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal:
  - Unstable: 1 failed test(s): selftest_mptcp_join 🔴:
  - Task: https://cirrus-ci.com/task/5570153533407232
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5570153533407232/summary/summary.txt

- KVM Validation: debug:
  - Unstable: 2 failed test(s): selftest_diag selftest_mptcp_join 🔴:
  - Task: https://cirrus-ci.com/task/6696053440249856
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6696053440249856/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/6e6958f5c0e6


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
Re: [PATCH net-next] mptcp: fix deadlock in mptcp_close()
Posted by Paolo Abeni 1 year, 11 months ago
On Thu, 2022-05-05 at 14:03 +0300, Dan Carpenter wrote:
> The mptcp_data_lock/unlock(sk) functions are taking the same spin lock
> as the lock_sock()/release_sock() functions.  So we're already holding
> the lock at this point and taking it again will lead to a deadlock.

Note that lock_sock() (and release_sock()) releases the relevant
spinlock before completion. AFAICs the above deadlock is not possible.

Still I think we can revert commit 4293248c6704, I don't see why we
need the addtional spin lock ?!? 

Thanks!

Paolo


Re: [PATCH net-next] mptcp: fix deadlock in mptcp_close()
Posted by Dan Carpenter 1 year, 11 months ago
On Thu, May 05, 2022 at 01:50:01PM +0200, Paolo Abeni wrote:
> On Thu, 2022-05-05 at 14:03 +0300, Dan Carpenter wrote:
> > The mptcp_data_lock/unlock(sk) functions are taking the same spin lock
> > as the lock_sock()/release_sock() functions.  So we're already holding
> > the lock at this point and taking it again will lead to a deadlock.
> 
> Note that lock_sock() (and release_sock()) releases the relevant
> spinlock before completion. AFAICs the above deadlock is not possible.
> 

Oh.  Yeah.  You're right.  I had hard coded into my local copy of Smatch
that it took that lock.

	{"lock_sock_nested", LOCK,   spin_lock, 0, "&$->sk_lock.slock"},
	{"release_sock",     UNLOCK, spin_lock, 0, "&$->sk_lock.slock"},

But that's wrong...  It drops the lock as you say.

regards,
dan carpenter