net/smc/af_smc.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
Fixes deadlock described in this bug:
https://syzkaller.appspot.com/bug?extid=e953a8f3071f5c0a28fd.
Specific crash report here:
https://syzkaller.appspot.com/text?tag=CrashReport&x=14670e07980000.
This bug is a false positive lockdep warning since gtp and smc use
completely different socket protocols.
Lockdep thinks that lock_sock() in smc will deadlock with gtp's
lock_sock() acquisition. Adding a function that initializes lockdep
labels for smc socks resolved the false positives in lockdep upon
testing. Since smc uses AF_SMC and SOCKSTREAM, two labels are created to
distinguish between proper smc socks and non smc socks incorrectly
input into the function.
Signed-off-by: Daniel Yang <danielyangkang@gmail.com>
Reported-by: syzbot+e953a8f3071f5c0a28fd@syzkaller.appspotmail.com
---
v1->v2: Add lockdep annotations instead of changing locking order
net/smc/af_smc.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 0316217b7..4de70bfd5 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -16,6 +16,8 @@
* based on prototype from Frank Blaschka
*/
+#include "linux/lockdep_types.h"
+#include "linux/socket.h"
#define KMSG_COMPONENT "smc"
#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
@@ -2755,6 +2757,24 @@ int smc_getname(struct socket *sock, struct sockaddr *addr,
return smc->clcsock->ops->getname(smc->clcsock, addr, peer);
}
+static struct lock_class_key smc_slock_key[2];
+static struct lock_class_key smc_key[2];
+
+static inline void smc_sock_lock_init(struct sock *sk)
+{
+ bool is_smc = (sk->sk_family == AF_SMC) && sk_is_tcp(sk);
+
+ sock_lock_init_class_and_name(sk,
+ is_smc ?
+ "smc_lock-AF_SMC_SOCKSTREAM" :
+ "smc_lock-INVALID",
+ &smc_slock_key[is_smc],
+ is_smc ?
+ "smc_sk_lock-AF_SMC_SOCKSTREAM" :
+ "smc_sk_lock-INVALID",
+ &smc_key[is_smc]);
+}
+
int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
{
struct sock *sk = sock->sk;
@@ -2762,6 +2782,7 @@ int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
int rc;
smc = smc_sk(sk);
+ smc_sock_lock_init(sk);
lock_sock(sk);
/* SMC does not support connect with fastopen */
--
2.39.2
Hi Daniel, kernel test robot noticed the following build warnings: [auto build test WARNING on linus/master] [also build test WARNING on v6.12-rc2 next-20241004] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Daniel-Yang/resolve-gtp-possible-deadlock-warning/20241005-125510 base: linus/master patch link: https://lore.kernel.org/r/20241005045411.118720-1-danielyangkang%40gmail.com patch subject: [PATCH v2] resolve gtp possible deadlock warning config: x86_64-buildonly-randconfig-005-20241007 (https://download.01.org/0day-ci/archive/20241007/202410072002.hT2D7135-lkp@intel.com/config) compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241007/202410072002.hT2D7135-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202410072002.hT2D7135-lkp@intel.com/ All warnings (new ones prefixed by >>): >> net/smc/af_smc.c:22:9: warning: 'pr_fmt' macro redefined [-Wmacro-redefined] 22 | #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt | ^ include/linux/printk.h:380:9: note: previous definition is here 380 | #define pr_fmt(fmt) fmt | ^ 1 warning generated. vim +/pr_fmt +22 net/smc/af_smc.c ac7138746e1413 Ursula Braun 2017-01-09 @22 #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt ac7138746e1413 Ursula Braun 2017-01-09 23 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Hi Daniel, kernel test robot noticed the following build warnings: [auto build test WARNING on linus/master] [also build test WARNING on v6.12-rc2 next-20241004] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Daniel-Yang/resolve-gtp-possible-deadlock-warning/20241005-125510 base: linus/master patch link: https://lore.kernel.org/r/20241005045411.118720-1-danielyangkang%40gmail.com patch subject: [PATCH v2] resolve gtp possible deadlock warning config: m68k-allmodconfig (https://download.01.org/0day-ci/archive/20241007/202410071937.ikrY4umF-lkp@intel.com/config) compiler: m68k-linux-gcc (GCC) 14.1.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241007/202410071937.ikrY4umF-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202410071937.ikrY4umF-lkp@intel.com/ All warnings (new ones prefixed by >>): >> net/smc/af_smc.c:22:9: warning: "pr_fmt" redefined 22 | #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt | ^~~~~~ In file included from include/linux/kernel.h:31, from include/linux/uio.h:8, from include/linux/socket.h:8, from net/smc/af_smc.c:20: include/linux/printk.h:380:9: note: this is the location of the previous definition 380 | #define pr_fmt(fmt) fmt | ^~~~~~ Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for GET_FREE_REGION Depends on [n]: SPARSEMEM [=n] Selected by [m]: - RESOURCE_KUNIT_TEST [=m] && RUNTIME_TESTING_MENU [=y] && KUNIT [=m] vim +/pr_fmt +22 net/smc/af_smc.c ac7138746e1413 Ursula Braun 2017-01-09 @22 #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt ac7138746e1413 Ursula Braun 2017-01-09 23 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
On Sat, Oct 5, 2024 at 6:54 AM Daniel Yang <danielyangkang@gmail.com> wrote: > > Fixes deadlock described in this bug: > https://syzkaller.appspot.com/bug?extid=e953a8f3071f5c0a28fd. > Specific crash report here: > https://syzkaller.appspot.com/text?tag=CrashReport&x=14670e07980000. > > This bug is a false positive lockdep warning since gtp and smc use > completely different socket protocols. > > Lockdep thinks that lock_sock() in smc will deadlock with gtp's > lock_sock() acquisition. Adding a function that initializes lockdep > labels for smc socks resolved the false positives in lockdep upon > testing. Since smc uses AF_SMC and SOCKSTREAM, two labels are created to > distinguish between proper smc socks and non smc socks incorrectly > input into the function. > > Signed-off-by: Daniel Yang <danielyangkang@gmail.com> > Reported-by: syzbot+e953a8f3071f5c0a28fd@syzkaller.appspotmail.com > --- > v1->v2: Add lockdep annotations instead of changing locking order > net/smc/af_smc.c | 21 +++++++++++++++++++++ > 1 file changed, 21 insertions(+) > > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c > index 0316217b7..4de70bfd5 100644 > --- a/net/smc/af_smc.c > +++ b/net/smc/af_smc.c > @@ -16,6 +16,8 @@ > * based on prototype from Frank Blaschka > */ > > +#include "linux/lockdep_types.h" > +#include "linux/socket.h" > #define KMSG_COMPONENT "smc" > #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt > > @@ -2755,6 +2757,24 @@ int smc_getname(struct socket *sock, struct sockaddr *addr, > return smc->clcsock->ops->getname(smc->clcsock, addr, peer); > } > > +static struct lock_class_key smc_slock_key[2]; > +static struct lock_class_key smc_key[2]; > + > +static inline void smc_sock_lock_init(struct sock *sk) > +{ > + bool is_smc = (sk->sk_family == AF_SMC) && sk_is_tcp(sk); > + > + sock_lock_init_class_and_name(sk, > + is_smc ? > + "smc_lock-AF_SMC_SOCKSTREAM" : > + "smc_lock-INVALID", > + &smc_slock_key[is_smc], > + is_smc ? > + "smc_sk_lock-AF_SMC_SOCKSTREAM" : > + "smc_sk_lock-INVALID", > + &smc_key[is_smc]); > +} > + > int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) > { > struct sock *sk = sock->sk; > @@ -2762,6 +2782,7 @@ int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) > int rc; > > smc = smc_sk(sk); > + smc_sock_lock_init(sk); > lock_sock(sk); > > /* SMC does not support connect with fastopen */ > -- > 2.39.2 > sock_lock_init_class_and_name() is not meant to be repeatedly called, from sendmsg() Find a way to do this once, perhaps in smc_create_clcsk(), but I will let SMC experts chime in.
On Sat, Oct 5, 2024 at 12:25 AM Eric Dumazet <edumazet@google.com> wrote: > > On Sat, Oct 5, 2024 at 6:54 AM Daniel Yang <danielyangkang@gmail.com> wrote: > > > > Fixes deadlock described in this bug: > > https://syzkaller.appspot.com/bug?extid=e953a8f3071f5c0a28fd. > > Specific crash report here: > > https://syzkaller.appspot.com/text?tag=CrashReport&x=14670e07980000. > > > > This bug is a false positive lockdep warning since gtp and smc use > > completely different socket protocols. > > > > Lockdep thinks that lock_sock() in smc will deadlock with gtp's > > lock_sock() acquisition. Adding a function that initializes lockdep > > labels for smc socks resolved the false positives in lockdep upon > > testing. Since smc uses AF_SMC and SOCKSTREAM, two labels are created to > > distinguish between proper smc socks and non smc socks incorrectly > > input into the function. > > > > Signed-off-by: Daniel Yang <danielyangkang@gmail.com> > > Reported-by: syzbot+e953a8f3071f5c0a28fd@syzkaller.appspotmail.com > > --- > > v1->v2: Add lockdep annotations instead of changing locking order > > net/smc/af_smc.c | 21 +++++++++++++++++++++ > > 1 file changed, 21 insertions(+) > > > > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c > > index 0316217b7..4de70bfd5 100644 > > --- a/net/smc/af_smc.c > > +++ b/net/smc/af_smc.c > > @@ -16,6 +16,8 @@ > > * based on prototype from Frank Blaschka > > */ > > > > +#include "linux/lockdep_types.h" > > +#include "linux/socket.h" > > #define KMSG_COMPONENT "smc" > > #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt > > > > @@ -2755,6 +2757,24 @@ int smc_getname(struct socket *sock, struct sockaddr *addr, > > return smc->clcsock->ops->getname(smc->clcsock, addr, peer); > > } > > > > +static struct lock_class_key smc_slock_key[2]; > > +static struct lock_class_key smc_key[2]; > > + > > +static inline void smc_sock_lock_init(struct sock *sk) > > +{ > > + bool is_smc = (sk->sk_family == AF_SMC) && sk_is_tcp(sk); > > + > > + sock_lock_init_class_and_name(sk, > > + is_smc ? > > + "smc_lock-AF_SMC_SOCKSTREAM" : > > + "smc_lock-INVALID", > > + &smc_slock_key[is_smc], > > + is_smc ? > > + "smc_sk_lock-AF_SMC_SOCKSTREAM" : > > + "smc_sk_lock-INVALID", > > + &smc_key[is_smc]); > > +} > > + > > int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) > > { > > struct sock *sk = sock->sk; > > @@ -2762,6 +2782,7 @@ int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) > > int rc; > > > > smc = smc_sk(sk); > > + smc_sock_lock_init(sk); > > lock_sock(sk); > > > > /* SMC does not support connect with fastopen */ > > -- > > 2.39.2 > > > > sock_lock_init_class_and_name() is not meant to be repeatedly called, > from sendmsg() > > Find a way to do this once, perhaps in smc_create_clcsk(), but I will > let SMC experts chime in. So I tried putting the lockdep annotations in smc_create_clcsk() as well as smc_sock_alloc() and they both fail to remove the false positive but putting the annotations in smc_sendmsg() gets rid of them. I put some print statements in the functions to see the addresses of the socks. [ 78.121827][ T8326] smc: smc_create_clcsk clcsk_addr: ffffc90007f0fd20 [ 78.122436][ T8326] smc: sendmsg sk_addr: ffffc90007f0fa88 [ 78.126907][ T8326] smc: __smc_create input_param clcsock: 0000000000000000 [ 78.134395][ T8326] smc: smc_sock_alloc sk_addr: ffffc90007f0fd70 [ 78.136679][ T8326] smc: smc_create_clcsk clcsk_clcsk: ffffc90007f0fd70 It appears that none of the smc allocation methods are called, so where else exactly could the sock used in sendmsg be created?
On 10/7/24 2:54 PM, Daniel Yang wrote: > On Sat, Oct 5, 2024 at 12:25 AM Eric Dumazet <edumazet@google.com> wrote: >> >> On Sat, Oct 5, 2024 at 6:54 AM Daniel Yang <danielyangkang@gmail.com> wrote: >>> >>> Fixes deadlock described in this bug: >>> https://syzkaller.appspot.com/bug?extid=e953a8f3071f5c0a28fd. >>> Specific crash report here: >>> https://syzkaller.appspot.com/text?tag=CrashReport&x=14670e07980000. >>> >>> This bug is a false positive lockdep warning since gtp and smc use >>> completely different socket protocols. >>> >>> Lockdep thinks that lock_sock() in smc will deadlock with gtp's >>> lock_sock() acquisition. Adding a function that initializes lockdep >>> labels for smc socks resolved the false positives in lockdep upon >>> testing. Since smc uses AF_SMC and SOCKSTREAM, two labels are created to >>> distinguish between proper smc socks and non smc socks incorrectly >>> input into the function. >>> >>> Signed-off-by: Daniel Yang <danielyangkang@gmail.com> >>> Reported-by: syzbot+e953a8f3071f5c0a28fd@syzkaller.appspotmail.com >>> --- >>> v1->v2: Add lockdep annotations instead of changing locking order >>> net/smc/af_smc.c | 21 +++++++++++++++++++++ >>> 1 file changed, 21 insertions(+) >>> >>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c >>> index 0316217b7..4de70bfd5 100644 >>> --- a/net/smc/af_smc.c >>> +++ b/net/smc/af_smc.c >>> @@ -16,6 +16,8 @@ >>> * based on prototype from Frank Blaschka >>> */ >>> >>> +#include "linux/lockdep_types.h" >>> +#include "linux/socket.h" >>> #define KMSG_COMPONENT "smc" >>> #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt >>> >>> @@ -2755,6 +2757,24 @@ int smc_getname(struct socket *sock, struct sockaddr *addr, >>> return smc->clcsock->ops->getname(smc->clcsock, addr, peer); >>> } >>> >>> +static struct lock_class_key smc_slock_key[2]; >>> +static struct lock_class_key smc_key[2]; >>> + >>> +static inline void smc_sock_lock_init(struct sock *sk) >>> +{ >>> + bool is_smc = (sk->sk_family == AF_SMC) && sk_is_tcp(sk); >>> + >>> + sock_lock_init_class_and_name(sk, >>> + is_smc ? >>> + "smc_lock-AF_SMC_SOCKSTREAM" : >>> + "smc_lock-INVALID", >>> + &smc_slock_key[is_smc], >>> + is_smc ? >>> + "smc_sk_lock-AF_SMC_SOCKSTREAM" : >>> + "smc_sk_lock-INVALID", >>> + &smc_key[is_smc]); >>> +} >>> + >>> int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) >>> { >>> struct sock *sk = sock->sk; >>> @@ -2762,6 +2782,7 @@ int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) >>> int rc; >>> >>> smc = smc_sk(sk); >>> + smc_sock_lock_init(sk); >>> lock_sock(sk); >>> >>> /* SMC does not support connect with fastopen */ >>> -- >>> 2.39.2 >>> >> >> sock_lock_init_class_and_name() is not meant to be repeatedly called, >> from sendmsg() >> >> Find a way to do this once, perhaps in smc_create_clcsk(), but I will >> let SMC experts chime in. > > So I tried putting the lockdep annotations in smc_create_clcsk() as > well as smc_sock_alloc() and they both fail to remove the false > positive but putting the annotations in smc_sendmsg() gets rid of > them. I put some print statements in the functions to see the > addresses of the socks. > > [ 78.121827][ T8326] smc: smc_create_clcsk clcsk_addr: ffffc90007f0fd20 > [ 78.122436][ T8326] smc: sendmsg sk_addr: ffffc90007f0fa88 > [ 78.126907][ T8326] smc: __smc_create input_param clcsock: 0000000000000000 > [ 78.134395][ T8326] smc: smc_sock_alloc sk_addr: ffffc90007f0fd70 > [ 78.136679][ T8326] smc: smc_create_clcsk clcsk_clcsk: ffffc90007f0fd70 > > It appears that none of the smc allocation methods are called, so > where else exactly could the sock used in sendmsg be created? I think the problem you described can be solved through https://lore.kernel.org/netdev/20240912000446.1025844-1-xiyou.wangcong@gmail.com/, but Cong Wang seems to have given up on following up at the moment. If you are interested, you can try take on this problem. Additionally, if you want to make sock_lock_init_class_and_name as a solution, the correct approach might be (But I do not recommend doing so. I still hope to maintain consistency between IPPROTO_SMC and other inet implementations as much as possible.) +static struct lock_class_key smc_slock_keys[2]; +static struct lock_class_key smc_keys[2]; + static int smc_inet_init_sock(struct sock *sk) { struct net *net = sock_net(sk); + int rc; /* init common smc sock */ smc_sk_init(net, sk, IPPROTO_SMC); /* create clcsock */ - return smc_create_clcsk(net, sk, sk->sk_family); + rc = smc_create_clcsk(net, sk, sk->sk_family); + if (rc) + return rc; + + switch (sk->sk_family) { + case AF_INET: + sock_lock_init_class_and_name(sk, "slock-AF_INET-SMC", + &smc_slock_keys[0], + "sk_lock-AF_INET-SMC", + &smc_keys[0]); + break; + case AF_INET6: + sock_lock_init_class_and_name(sk, "slock-AF_INET6-SMC", + &smc_slock_keys[1], + "sk_lock-AF_INET6-SMC", + &smc_keys[1]); + break; + default: + WARN_ON_ONCE(1); + } + + return 0; }
On Wed, Oct 9, 2024 at 12:20 AM D. Wythe <alibuda@linux.alibaba.com> wrote: > > > > On 10/7/24 2:54 PM, Daniel Yang wrote: > > On Sat, Oct 5, 2024 at 12:25 AM Eric Dumazet <edumazet@google.com> wrote: > >> > >> On Sat, Oct 5, 2024 at 6:54 AM Daniel Yang <danielyangkang@gmail.com> wrote: > >>> > >>> Fixes deadlock described in this bug: > >>> https://syzkaller.appspot.com/bug?extid=e953a8f3071f5c0a28fd. > >>> Specific crash report here: > >>> https://syzkaller.appspot.com/text?tag=CrashReport&x=14670e07980000. > >>> > >>> This bug is a false positive lockdep warning since gtp and smc use > >>> completely different socket protocols. > >>> > >>> Lockdep thinks that lock_sock() in smc will deadlock with gtp's > >>> lock_sock() acquisition. Adding a function that initializes lockdep > >>> labels for smc socks resolved the false positives in lockdep upon > >>> testing. Since smc uses AF_SMC and SOCKSTREAM, two labels are created to > >>> distinguish between proper smc socks and non smc socks incorrectly > >>> input into the function. > >>> > >>> Signed-off-by: Daniel Yang <danielyangkang@gmail.com> > >>> Reported-by: syzbot+e953a8f3071f5c0a28fd@syzkaller.appspotmail.com > >>> --- > >>> v1->v2: Add lockdep annotations instead of changing locking order > >>> net/smc/af_smc.c | 21 +++++++++++++++++++++ > >>> 1 file changed, 21 insertions(+) > >>> > >>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c > >>> index 0316217b7..4de70bfd5 100644 > >>> --- a/net/smc/af_smc.c > >>> +++ b/net/smc/af_smc.c > >>> @@ -16,6 +16,8 @@ > >>> * based on prototype from Frank Blaschka > >>> */ > >>> > >>> +#include "linux/lockdep_types.h" > >>> +#include "linux/socket.h" > >>> #define KMSG_COMPONENT "smc" > >>> #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt > >>> > >>> @@ -2755,6 +2757,24 @@ int smc_getname(struct socket *sock, struct sockaddr *addr, > >>> return smc->clcsock->ops->getname(smc->clcsock, addr, peer); > >>> } > >>> > >>> +static struct lock_class_key smc_slock_key[2]; > >>> +static struct lock_class_key smc_key[2]; > >>> + > >>> +static inline void smc_sock_lock_init(struct sock *sk) > >>> +{ > >>> + bool is_smc = (sk->sk_family == AF_SMC) && sk_is_tcp(sk); > >>> + > >>> + sock_lock_init_class_and_name(sk, > >>> + is_smc ? > >>> + "smc_lock-AF_SMC_SOCKSTREAM" : > >>> + "smc_lock-INVALID", > >>> + &smc_slock_key[is_smc], > >>> + is_smc ? > >>> + "smc_sk_lock-AF_SMC_SOCKSTREAM" : > >>> + "smc_sk_lock-INVALID", > >>> + &smc_key[is_smc]); > >>> +} > >>> + > >>> int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) > >>> { > >>> struct sock *sk = sock->sk; > >>> @@ -2762,6 +2782,7 @@ int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) > >>> int rc; > >>> > >>> smc = smc_sk(sk); > >>> + smc_sock_lock_init(sk); > >>> lock_sock(sk); > >>> > >>> /* SMC does not support connect with fastopen */ > >>> -- > >>> 2.39.2 > >>> > >> > >> sock_lock_init_class_and_name() is not meant to be repeatedly called, > >> from sendmsg() > >> > >> Find a way to do this once, perhaps in smc_create_clcsk(), but I will > >> let SMC experts chime in. > > > > So I tried putting the lockdep annotations in smc_create_clcsk() as > > well as smc_sock_alloc() and they both fail to remove the false > > positive but putting the annotations in smc_sendmsg() gets rid of > > them. I put some print statements in the functions to see the > > addresses of the socks. > > > > [ 78.121827][ T8326] smc: smc_create_clcsk clcsk_addr: ffffc90007f0fd20 > > [ 78.122436][ T8326] smc: sendmsg sk_addr: ffffc90007f0fa88 > > [ 78.126907][ T8326] smc: __smc_create input_param clcsock: 0000000000000000 > > [ 78.134395][ T8326] smc: smc_sock_alloc sk_addr: ffffc90007f0fd70 > > [ 78.136679][ T8326] smc: smc_create_clcsk clcsk_clcsk: ffffc90007f0fd70 > > > > It appears that none of the smc allocation methods are called, so > > where else exactly could the sock used in sendmsg be created? > > > I think the problem you described can be solved through > https://lore.kernel.org/netdev/20240912000446.1025844-1-xiyou.wangcong@gmail.com/, but Cong Wang > seems to have given up on following up at the moment. If you are interested, you can try take on > this problem. > > > Additionally, if you want to make sock_lock_init_class_and_name as a solution, the correct approach > might be (But I do not recommend doing so. I still hope to maintain consistency between IPPROTO_SMC > and other inet implementations as much as possible.) > > > +static struct lock_class_key smc_slock_keys[2]; > +static struct lock_class_key smc_keys[2]; > + > static int smc_inet_init_sock(struct sock *sk) > { > struct net *net = sock_net(sk); > + int rc; > > /* init common smc sock */ > smc_sk_init(net, sk, IPPROTO_SMC); > /* create clcsock */ > - return smc_create_clcsk(net, sk, sk->sk_family); > + rc = smc_create_clcsk(net, sk, sk->sk_family); > + if (rc) > + return rc; > + > + switch (sk->sk_family) { > + case AF_INET: > + sock_lock_init_class_and_name(sk, "slock-AF_INET-SMC", > + &smc_slock_keys[0], > + "sk_lock-AF_INET-SMC", > + &smc_keys[0]); > + break; > + case AF_INET6: > + sock_lock_init_class_and_name(sk, "slock-AF_INET6-SMC", > + &smc_slock_keys[1], > + "sk_lock-AF_INET6-SMC", > + &smc_keys[1]); > + break; > + default: > + WARN_ON_ONCE(1); > + } > + > + return 0; > } > > I took a look at Cong Wang's patches and it seems that switching from rtnl to rcu is unfeasible since gtp_encap_enable_socket is called by rtnl_newlink_create through a function pointer and the rtnl_mutex acquisition happens somewhere up the call stack. Since many other newlink() functions rely on the assumption of rtnl lock being acquired, it seems quite unfeasible to use this approach in this case. I tried your suggested solution and it got rid of the bug. I made small changes to it for readability and will resend it. Should lockdep annotations also be added to smc_sock_alloc since smc_sk_init() is also called there as well or is it guaranteed to have non-conflicting lockdep labels?
© 2016 - 2024 Red Hat, Inc.