Move sk_net_refcnt_upgrade() outside the socket lock to avoid GFP_KERNEL
allocation under lock_sock_nested(), which can trigger page reclaim and
create a circular locking dependency detected by lockdep.

The circular dependency chain detected:

  fs_reclaim --> sk_lock-AF_INET --> k-sk_lock-AF_INET/1

This operation is safe after lock release since the socket is fully
initialized and no other references to it exist yet.

Tested with mptcp/mptcp-upstream-virtme-docker CI: all 24 tests pass.

Fixes: 0cafd77dcd03 ("net: add a refcount tracker for kernel sockets")
Reported-by: syzbot+fb2c3fa2ba28aec94627@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/599
Signed-off-by: Kalpan Jani <kalpan.jani@mpiricsoftware.com>
---
 net/mptcp/subflow.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 8e386899ceb9..0dc8d32ef291 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1800,11 +1800,6 @@ int mptcp_subflow_create_socket(struct sock *sk, unsigned short family,
 	/* the newly created socket has to be in the same cgroup as its parent */
 	mptcp_attach_cgroup(sk, sf->sk);
 
-	/* kernel sockets do not by default acquire net ref, but TCP timer
-	 * needs it.
-	 * Update ns_tracker to current stack trace and refcounted tracker.
-	 */
-	sk_net_refcnt_upgrade(sf->sk);
 	err = tcp_set_ulp(sf->sk, "mptcp");
 	if (err)
 		goto err_free;
@@ -1812,6 +1807,11 @@ int mptcp_subflow_create_socket(struct sock *sk, unsigned short family,
 	mptcp_sockopt_sync_locked(mptcp_sk(sk), sf->sk);
 	release_sock(sf->sk);
 
+	/* Upgrade net namespace refcount after releasing the socket lock
+	 * to avoid GFP_KERNEL allocation under lock (lockdep: fs_reclaim).
+	 */
+	sk_net_refcnt_upgrade(sf->sk);
+
 	/* the newly created socket really belongs to the owning MPTCP
 	 * socket, even if for additional subflows the allocation is performed
 	 * by a kernel workqueue. Adjust inode references, so that the
--
2.43.0
Hi, Kalpan
On 6/17/2026 2:48 PM, Kalpan Jani wrote:
> Move sk_net_refcnt_upgrade() outside the socket lock to avoid GFP_KERNEL
> allocation under lock_sock_nested(), which can trigger page reclaim and
> create a circular locking dependency detected by lockdep.
>
Thanks for the patch.
I might be missing something, but I wonder if this is still needed after:
d532cddb6c60 ("nbd: Reclassify sockets to avoid lockdep circular dependency")
The reported chain seems to involve NBD sockets sharing the generic TCP lock
classes, and that patch should address it more directly.
> The circular dependency chain detected:
> fs_reclaim --> sk_lock-AF_INET --> k-sk_lock-AF_INET/1
>
> This operation is safe after lock release since the socket is fully
> initialized and no other references to it exist yet.
>
Even with this change, the locked section still contains GFP_KERNEL
allocations: tcp_set_ulp() is called under lock_sock_nested(), and the MPTCP
ULP init path allocates the subflow context with GFP_KERNEL.
So if the NBD dependency is still present, I am not sure this fixes the
reported lockdep dependency rather than moving it to another allocation site.
It might be worth retesting this report with the NBD reclassification patch in
the tree before changing the MPTCP locking here.
Thanks,
Li Xiasong
Hi Li, Kalpan,
On 23/06/2026 06:50, Li Xiasong wrote:
> Hi, Kalpan
>
> On 6/17/2026 2:48 PM, Kalpan Jani wrote:
>> Move sk_net_refcnt_upgrade() outside the socket lock to avoid GFP_KERNEL
>> allocation under lock_sock_nested(), which can trigger page reclaim and
>> create a circular locking dependency detected by lockdep.
>>
>
> Thanks for the patch.
>
> I might be missing something, but I wonder if this is still needed after:
>
> d532cddb6c60 ("nbd: Reclassify sockets to avoid lockdep circular dependency")
>
> The reported chain seems to involve NBD sockets sharing the generic TCP lock
> classes, and that patch should address it more directly.
@Li: Thank you for pointing to this commit. Indeed, that seems to be the
required fix for this issue.
@Kalpan: were you able to reproduce the issue pointed out by syzbot? If
yes, can you check if Eric's fix is enough? Or did you try to fix it by
analysing the code?
>> The circular dependency chain detected:
>> fs_reclaim --> sk_lock-AF_INET --> k-sk_lock-AF_INET/1
>>
>> This operation is safe after lock release since the socket is fully
>> initialized and no other references to it exist yet.
>>
>
> Even with this change, the locked section still contains GFP_KERNEL
> allocations: tcp_set_ulp() is called under lock_sock_nested(), and the MPTCP
> ULP init path allocates the subflow context with GFP_KERNEL.
>
> So if the NBD dependency is still present, I am not sure this fixes the
> reported lockdep dependency rather than moving it to another allocation site.
Indeed, this patch might not be enough to fix the issue.
> It might be worth retesting this report with the NBD reclassification patch in
> the tree before changing the MPTCP locking here.
If it is not possible to reproduce the bug, I suggest closing the
corresponding bug and dropping this patch. WDYT?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
Hi Matt, Li,
Thank you both for the review.
To be transparent: I did not attempt to reproduce the issue locally —
the patch was based purely on analysis of the lockdep report and the
allocation path through sk_net_refcnt_upgrade().
Li's point is well taken on two counts:
1. d532cddb6c60 ("nbd: Reclassify sockets to avoid lockdep circular
dependency") may already address the root cause by fixing the NBD
lock class confusion.
2. Even with my change, tcp_set_ulp() and the MPTCP ULP init path
still perform GFP_KERNEL allocations under lock_sock_nested(), so
the fix would be incomplete at best.
I will test against a tree containing d532cddb6c60 this week and check
whether the lockdep warning can still be triggered. If Eric's fix is
sufficient, I will close issue #599 and drop this patch.
I will report back with results.
Cheers,
Kalpan Jani
From: Matthieu Baerts <matttbe@kernel.org>
To: "Li Xiasong"<lixiasong1@huawei.com>, "Kalpan Jani"<kalpan.jani@mpiricsoftware.com>
Cc: <martineau@kernel.org>, <pabeni@redhat.com>, <shardul.b@mpiricsoftware.com>, <janak@mpiric.us>, <kalpanjani009@gmail.com>, <akshit@mpiricsoftware.com>, <syzbot+fb2c3fa2ba28aec94627@syzkaller.appspotmail.com>, "mptcp@lists.linux.dev"<mptcp@lists.linux.dev>, "zhangchangzhong"<zhangchangzhong@huawei.com>, "yuehaibing"<yuehaibing@huawei.com>, "weiyongjun (A)"<weiyongjun1@huawei.com>
Date: Tue, 23 Jun 2026 14:24:31 +0530
Subject: Re: [PATCH mptcp-next] mptcp: fix lockdep splat in mptcp_subflow_create_socket
Hi Kalpan,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/27671843184
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/950229cf0bbc
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1112712
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have already been made to have a
stable test suite when executed on a public CI like here, it is possible that
some reported issues are not due to your modifications. Still, do not hesitate
to help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)