include/net/sock.h | 2 ++ net/netlink/af_netlink.c | 3 ++- 2 files changed, 4 insertions(+), 1 deletion(-)
The kernel may crash when deleting a genetlink family if there are still
listeners for that family:
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c000000000c080bc] netlink_update_socket_mc+0x3c/0xc0
LR [c000000000c0f764] __netlink_clear_multicast_users+0x74/0xc0
Call Trace:
__netlink_clear_multicast_users+0x74/0xc0
genl_unregister_family+0xd4/0x2d0
Change the unsafe loop on the list to a safe one, because inside the
loop there is an element removal from this list.
Fixes: b8273570f802 ("genetlink: fix netns vs. netlink table locking (2)")\
Cc: stable@vger.kernel.org
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
---
include/net/sock.h | 2 ++
net/netlink/af_netlink.c | 3 ++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index c58ca8dd561b..eec77a18602a 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -894,6 +894,8 @@ static inline void sk_add_bind_node(struct sock *sk,
hlist_for_each_entry_safe(__sk, tmp, list, sk_node)
#define sk_for_each_bound(__sk, list) \
hlist_for_each_entry(__sk, list, sk_bind_node)
+#define sk_for_each_bound_safe(__sk, tmp, list) \
+ hlist_for_each_entry_safe(__sk, tmp, list, sk_bind_node)
/**
* sk_for_each_entry_offset_rcu - iterate over a list at a given struct offset
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 0b7a89db3ab7..0a9287fadb47 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2136,8 +2136,9 @@ void __netlink_clear_multicast_users(struct sock *ksk, unsigned int group)
{
struct sock *sk;
struct netlink_table *tbl = &nl_table[ksk->sk_protocol];
+ struct hlist_node *tmp;
- sk_for_each_bound(sk, &tbl->mc_list)
+ sk_for_each_bound_safe(sk, tmp, &tbl->mc_list)
netlink_update_socket_mc(nlk_sk(sk), group, 0);
}
--
2.40.1
On Tue, 1 Oct 2024 14:58:28 +0300 Anastasia Kovaleva wrote: > The kernel may crash when deleting a genetlink family if there are still > listeners for that family: Could you add a selftest? Should be fairly easy using YNL, ncdevmem is the only user so far. > Oops: Kernel access of bad area, sig: 11 [#1] > ... > NIP [c000000000c080bc] netlink_update_socket_mc+0x3c/0xc0 > LR [c000000000c0f764] __netlink_clear_multicast_users+0x74/0xc0 > Call Trace: > __netlink_clear_multicast_users+0x74/0xc0 > genl_unregister_family+0xd4/0x2d0 > > Change the unsafe loop on the list to a safe one, because inside the > loop there is an element removal from this list. > > Fixes: b8273570f802 ("genetlink: fix netns vs. netlink table locking (2)")\ nit: trailing \ at the end of the line here
On Wed, 2 Oct 2024 06:02:40 -0700 Jakub Kicinski wrote: > Could you add a selftest? Should be fairly easy using YNL, ncdevmem is > the only user so far. Thank you for your review. Originally I've stumbled upon this bug while developing a functionality in the iscsit driver to inform the userspace about a failed iscsi session authentication. I am not familiar with the netdev test tools; with YNL and ncdevmem particularly. The change seems trivial enough. It fixes an obvious bug with the deletion of an element of a list that you are currently iterating over. I assume that it wasn’t reproduced often because there are few users of the genetlink and usually there is no listeners of a family while removing a driver that uses it. If it is really necessary, it would take me a couple of days to make a test. If not, I can send a v3 patch that just removes a trailing \.
On Wed, 2 Oct 2024 17:53:18 +0000 Anastasia Kovaleva wrote: > If it is really necessary, it would take me a couple of days to make a test. The only tricky part, I think, is finding a family which can be easily unloaded for the test. I don't see any good candidates, so fine.
© 2016 - 2024 Red Hat, Inc.