[PATCH mptcp-next] mptcp: fix lockdep splat by using GFP_ATOMIC in sk_net_refcnt_upgrade

Kalpan Jani posted 1 patch 2 weeks, 4 days ago
Patches applied successfully
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/20260616124801.2282199-1-kalpan.jani@mpiricsoftware.com
net/core/sock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
sk_net_refcnt_upgrade() is called from mptcp_subflow_create_socket() while
a socket lock is held. It performs an allocation via ref_tracker_alloc(gfp),
and the gfp mask it passes is currently GFP_KERNEL.

GFP_KERNEL can trigger page reclaim, which creates a lockdep edge with
fs_reclaim, leading to a circular dependency when combined with other
lock chains from the block layer (nbd driver).

Change the allocation to use GFP_ATOMIC instead. This is safe because:
- The ref_tracker struct is small (~200 bytes)
- GFP_ATOMIC allocation failure is extremely unlikely
- GFP_ATOMIC is commonly used in other locked contexts

Tested with mptcp/mptcp-upstream-virtme-docker CI: all 24 tests pass.

Circular dependency chain detected:
  fs_reclaim --> sk_lock-AF_INET --> k-sk_lock-AF_INET/1
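
The chain above can be modelled in userspace as a small lock-order graph. This is only a toy sketch, not lockdep itself: the three class names come from the report, while the enum, the helper names and the reading that a GFP_KERNEL allocation under k-sk_lock-AF_INET/1 adds the closing edge back to fs_reclaim are assumptions made for illustration.

```c
#include <assert.h>

/* Lock classes named in the report; the numbering is arbitrary. */
enum { FS_RECLAIM, SK_LOCK, K_SK_LOCK, NR_LOCKS };

/* edge[a][b] != 0 means "b was acquired while a was held". */
static int edge[NR_LOCKS][NR_LOCKS];

static void add_edge(int held, int acquired)
{
	assert(held < NR_LOCKS && acquired < NR_LOCKS);
	edge[held][acquired] = 1;
}

/* Depth-first search: can 'target' be reached from 'from'? */
static int reachable(int from, int target, int *seen)
{
	if (from == target)
		return 1;
	seen[from] = 1;
	for (int next = 0; next < NR_LOCKS; next++)
		if (edge[from][next] && !seen[next] &&
		    reachable(next, target, seen))
			return 1;
	return 0;
}

/*
 * Acquiring 'acquired' while holding 'held' closes a cycle when
 * 'held' is already reachable from 'acquired'.
 */
static int would_deadlock(int held, int acquired)
{
	int seen[NR_LOCKS] = { 0 };

	return reachable(acquired, held, seen);
}

/* Record the existing chain: fs_reclaim --> sk_lock --> k-sk_lock/1. */
static void build_reported_chain(void)
{
	add_edge(FS_RECLAIM, SK_LOCK);
	add_edge(SK_LOCK, K_SK_LOCK);
}
```

After build_reported_chain(), would_deadlock(K_SK_LOCK, FS_RECLAIM) reports a cycle. In this model that is the edge a reclaim-capable allocation under the nested socket lock would add.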

Fixes: 0cafd77dcd03 ("net: add a refcount tracker for kernel sockets")
Reported-by: syzbot+fb2c3fa2ba28aec94627@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/599
Signed-off-by: Kalpan Jani <kalpan.jani@mpiricsoftware.com>
---
 net/core/sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 8a59bfaa8096..8458c304dabc 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2399,7 +2399,7 @@ void sk_net_refcnt_upgrade(struct sock *sk)
 	__netns_tracker_free(net, &sk->ns_tracker, false);
 	net_passive_dec(net);
 	sk->sk_net_refcnt = 1;
-	get_net_track(net, &sk->ns_tracker, GFP_KERNEL);
+	get_net_track(net, &sk->ns_tracker, GFP_ATOMIC);
 	sock_inuse_add(net, 1);
 }
 EXPORT_SYMBOL_GPL(sk_net_refcnt_upgrade);
-- 
2.43.0
Re: [PATCH mptcp-next] mptcp: fix lockdep splat by using GFP_ATOMIC in sk_net_refcnt_upgrade
Posted by Kalpan Jani 2 weeks, 3 days ago
Please ignore the previous patch; I sent the wrong version (the GFP_ATOMIC workaround).

Switching to GFP_ATOMIC is only a workaround, not the proper fix. I will send a proper fix that moves sk_net_refcnt_upgrade() outside the socket lock, which is the correct architectural approach.

Please wait for the next patch (Version 1: move code outside lock).
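
To illustrate the ordering only, here is a userspace toy, not the actual follow-up patch. Every name in it (toy_sock, toy_lock_sock, toy_refcnt_upgrade and so on) is hypothetical, and it assumes the socket is not yet visible to other contexts, so upgrading before taking the lock is safe.

```c
#include <stdlib.h>

static int lock_depth;	/* > 0 while the toy socket lock is held */
static int violations;	/* blocking allocations made under the lock */

struct toy_sock {
	void *tracker;		/* stands in for sk->ns_tracker */
	int net_refcnt;		/* stands in for sk->sk_net_refcnt */
};

static void toy_lock_sock(void)    { lock_depth++; }
static void toy_release_sock(void) { lock_depth--; }

/* Stands in for a GFP_KERNEL allocation, which may sleep and reclaim. */
static void *toy_alloc_may_sleep(size_t size)
{
	if (lock_depth > 0)
		violations++;
	return malloc(size);
}

/* Hypothetical stand-in for sk_net_refcnt_upgrade(). */
static void toy_refcnt_upgrade(struct toy_sock *sk)
{
	free(sk->tracker);
	sk->tracker = toy_alloc_may_sleep(64);
	sk->net_refcnt = 1;
}

/* Problematic ordering: the upgrade runs inside the locked region. */
static void setup_upgrade_under_lock(struct toy_sock *sk)
{
	toy_lock_sock();
	toy_refcnt_upgrade(sk);
	toy_release_sock();
}

/* Ordering described above: upgrade first, then take the lock. */
static void setup_upgrade_before_lock(struct toy_sock *sk)
{
	toy_refcnt_upgrade(sk);
	toy_lock_sock();
	/* remaining per-socket setup would run here */
	toy_release_sock();
}
```

With the second ordering the blocking allocation never runs with the lock held, so the gfp mask can stay as it is.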

Thanks,
Kalpan Jani


From: Kalpan Jani <kalpan.jani@mpiricsoftware.com>
To: <mptcp@lists.linux.dev>
Cc: <matthieu.baerts@tessares.net>, "Kalpan Jani"<kalpan.jani@mpiricsoftware.com>, <syzbot+fb2c3fa2ba28aec94627@syzkaller.appspotmail.com>
Date: Tue, 16 Jun 2026 18:18:01 +0530
Subject: [PATCH mptcp-next] mptcp: fix lockdep splat by using GFP_ATOMIC in sk_net_refcnt_upgrade

Re: [PATCH mptcp-next] mptcp: fix lockdep splat by using GFP_ATOMIC in sk_net_refcnt_upgrade
Posted by MPTCP CI 2 weeks, 4 days ago
Hi Kalpan,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_add_addr ⚠️ 
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/27621862119

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/f101eea8485b
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1112313


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to keep the test suite
stable when it runs on a public CI like this one, some reported issues may not
be caused by your modifications. Still, do not hesitate to help us improve
that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)