net/ipv4/ipmr_base.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
following code flow, the RCU read lock is not held, causing the
following error when `RCU_PROVE` is not held. The same problem might
show up in the IPv6 code path.
6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G E N
-----------------------------
net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!!
rcu_scheduler_active = 2, debug_locks = 1
2 locks held by RetransmitAggre/3519:
#0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290
#1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90
stack backtrace:
lockdep_rcu_suspicious
mr_table_dump
ipmr_rtm_dumproute
rtnl_dump_all
rtnl_dumpit
netlink_dump
__netlink_dump_start
rtnetlink_rcv_msg
netlink_rcv_skb
netlink_unicast
netlink_sendmsg
This is not a problem per see, since the RTNL lock is held here, so, it
is safe to iterate in the list without the RCU read lock, as suggested
by Eric.
To alleviate the concern, modify the code to use
list_for_each_entry_rcu() with the RTNL-held argument.
The annotation will raise an error only if RTNL or RCU read lock are
missing during iteration, signaling a legitimate problem, otherwise it
will avoid this false positive.
This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls
this function as well.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Changes in v2:
- Instead of getting an RCU read lock, rely on rtnl mutex (Eric)
- Link to v1: https://lore.kernel.org/r/20241107-ipmr_rcu-v1-1-ad0cba8dffed@debian.org
- Still sending it against `net`, so, since this warning is annoying
---
net/ipv4/ipmr_base.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/ipmr_base.c b/net/ipv4/ipmr_base.c
index 271dc03fc6dbd9b35db4d5782716679134f225e4..f0af12a2f70bcdf5ba54321bf7ebebe798318abb 100644
--- a/net/ipv4/ipmr_base.c
+++ b/net/ipv4/ipmr_base.c
@@ -310,7 +310,8 @@ int mr_table_dump(struct mr_table *mrt, struct sk_buff *skb,
if (filter->filter_set)
flags |= NLM_F_DUMP_FILTERED;
- list_for_each_entry_rcu(mfc, &mrt->mfc_cache_list, list) {
+ list_for_each_entry_rcu(mfc, &mrt->mfc_cache_list, list,
+ lockdep_rtnl_is_held()) {
if (e < s_e)
goto next_entry;
if (filter->dev &&
---
base-commit: 25d70702142ac2115e75e01a0a985c6ea1d78033
change-id: 20241107-ipmr_rcu-291d85400b16
Best regards,
--
Breno Leitao <leitao@debian.org>
On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote: > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the > following code flow, the RCU read lock is not held, causing the > following error when `RCU_PROVE` is not held. The same problem might > show up in the IPv6 code path. good start, I hope someone can fix the gazillion warnings the CI is hitting on the table accesses :)
Hello Jakub, On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote: > On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote: > > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the > > following code flow, the RCU read lock is not held, causing the > > following error when `RCU_PROVE` is not held. The same problem might > > show up in the IPv6 code path. > > good start, I hope someone can fix the gazillion warnings the CI > is hitting on the table accesses :) If you have an updated list, I'd be happy to take a look. Last time, all the problems I found were being discussed upstream already. I am wondering if they didn't land upstream, or, if you have warnings that are not being currently fixed.
On Thu, 14 Nov 2024 00:55:57 -0800 Breno Leitao wrote: > On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote: > > On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote: > > > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the > > > following code flow, the RCU read lock is not held, causing the > > > following error when `RCU_PROVE` is not held. The same problem might > > > show up in the IPv6 code path. > > > > good start, I hope someone can fix the gazillion warnings the CI > > is hitting on the table accesses :) > > If you have an updated list, I'd be happy to take a look. > > Last time, all the problems I found were being discussed upstream > already. I am wondering if they didn't land upstream, or, if you have > warnings that are not being currently fixed. https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/859762/82-router-multicast-sh/stderr
On Thu, Nov 14, 2024 at 07:03:08AM -0800, Jakub Kicinski wrote: > On Thu, 14 Nov 2024 00:55:57 -0800 Breno Leitao wrote: > > On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote: > > > On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote: > > > > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the > > > > following code flow, the RCU read lock is not held, causing the > > > > following error when `RCU_PROVE` is not held. The same problem might > > > > show up in the IPv6 code path. > > > > > > good start, I hope someone can fix the gazillion warnings the CI > > > is hitting on the table accesses :) > > > > If you have an updated list, I'd be happy to take a look. > > > > Last time, all the problems I found were being discussed upstream > > already. I am wondering if they didn't land upstream, or, if you have > > warnings that are not being currently fixed. > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/859762/82-router-multicast-sh/stderr This one seems to be discussed in the following thread already. https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote: > This one seems to be discussed in the following thread already. > > https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/ That's why it rung a bell.. Stefan, are you planning to continue with the series?
> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote: >> This one seems to be discussed in the following thread already. >> >> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/ > > That's why it rung a bell.. > Stefan, are you planning to continue with the series? Yes, sorry for the delay, went on vacation and was busy with other tasks, but next week I plan to continue (i.e. refactor using refcount_t). Kind regards, Stefan
On 11/15/24 17:07, Stefan Wiehler wrote: >> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote: >>> This one seems to be discussed in the following thread already. >>> >>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/ >> >> That's why it rung a bell.. >> Stefan, are you planning to continue with the series? > > Yes, sorry for the delay, went on vacation and was busy with other tasks, but > next week I plan to continue (i.e. refactor using refcount_t). I forgot about that series and spent a little time investigating the scenario. I think we don't need a refcount: the tables are freed only at netns cleanup time, so the netns refcount is enough to guarantee that the tables are not deleted when escaping the RCU section. Some debug assertions could help clarify, document and make the schema more robust to later change. Side note, I think we need to drop the RCU lock moved by: https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/ as the seqfile core can call blocking functions - alloc(GFP_KERNEL) - between ->start() and ->stop(). The issue is pre-existent to that patch, and even to the patch introducing the original RCU() - the old read_lock() created an illegal atomic scope - but I think we should address it while touching this code. I have some patches implementing the above but I have hard times testing vs net/forwarding, as the forwarding ST setup is eluding me - with mausezahn internal errors. @Jakub: do you have by chance any cheap tip handy about the forwarding self-tests setup? Thanks, Paolo
On 11/15/24 17:55, Paolo Abeni wrote: > On 11/15/24 17:07, Stefan Wiehler wrote: >>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote: >>>> This one seems to be discussed in the following thread already. >>>> >>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/ >>> >>> That's why it rung a bell.. >>> Stefan, are you planning to continue with the series? >> >> Yes, sorry for the delay, went on vacation and was busy with other tasks, but >> next week I plan to continue (i.e. refactor using refcount_t). > > I forgot about that series and spent a little time investigating the > scenario. > > I think we don't need a refcount: the tables are freed only at netns > cleanup time, so the netns refcount is enough to guarantee that the > tables are not deleted when escaping the RCU section. > > Some debug assertions could help clarify, document and make the schema > more robust to later change. > > Side note, I think we need to drop the RCU lock moved by: > > https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/ > > as the seqfile core can call blocking functions - alloc(GFP_KERNEL) - > between ->start() and ->stop(). > > The issue is pre-existent to that patch, and even to the patch > introducing the original RCU() - the old read_lock() created an illegal > atomic scope - but I think we should address it while touching this code. @Stefan: are you ok if I go ahead with this work, or do you prefer finish it yourself? Thanks, Paolo
> On 11/15/24 17:55, Paolo Abeni wrote: >> On 11/15/24 17:07, Stefan Wiehler wrote: >>>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote: >>>>> This one seems to be discussed in the following thread already. >>>>> >>>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/ >>>> >>>> That's why it rung a bell.. >>>> Stefan, are you planning to continue with the series? >>> >>> Yes, sorry for the delay, went on vacation and was busy with other tasks, but >>> next week I plan to continue (i.e. refactor using refcount_t). >> >> I forgot about that series and spent a little time investigating the >> scenario. >> >> I think we don't need a refcount: the tables are freed only at netns >> cleanup time, so the netns refcount is enough to guarantee that the >> tables are not deleted when escaping the RCU section. >> >> Some debug assertions could help clarify, document and make the schema >> more robust to later change. >> >> Side note, I think we need to drop the RCU lock moved by: >> >> https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/ >> >> as the seqfile core can call blocking functions - alloc(GFP_KERNEL) - >> between ->start() and ->stop(). >> >> The issue is pre-existent to that patch, and even to the patch >> introducing the original RCU() - the old read_lock() created an illegal >> atomic scope - but I think we should address it while touching this code. > > @Stefan: are you ok if I go ahead with this work, or do you prefer > finish it yourself? Please go ahead, I have neither the expertise in the net subsystem nor the time to ramp-up (since this is just a side finding for us right now) to proceed with your proposal. I'll follow the discussion though and hope to learn something along the way! Kind regards, Stefan
On Fri, 15 Nov 2024 17:55:11 +0100 Paolo Abeni wrote: > @Jakub: do you have by chance any cheap tip handy about the forwarding > self-tests setup? I presume you mean how to get all the weird tools it requires in place? This is all the things we build on the worker: https://github.com/linux-netdev/nipa/blob/main/deploy/contest/remote/worker-setup.sh For mcast routing you only need a handful of those (mtools, smcroute, ndisc6?), but we don't really track which is needed where :S Ideally we'd use some image building to compose this automatically.. one day.
On 11/8/24 7:08 AM, Breno Leitao wrote: > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the > following code flow, the RCU read lock is not held, causing the > following error when `RCU_PROVE` is not held. The same problem might > show up in the IPv6 code path. > > 6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G E N > ----------------------------- > net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!! > > rcu_scheduler_active = 2, debug_locks = 1 > 2 locks held by RetransmitAggre/3519: > #0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290 > #1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90 > > stack backtrace: > lockdep_rcu_suspicious > mr_table_dump > ipmr_rtm_dumproute > rtnl_dump_all > rtnl_dumpit > netlink_dump > __netlink_dump_start > rtnetlink_rcv_msg > netlink_rcv_skb > netlink_unicast > netlink_sendmsg > > This is not a problem per see, since the RTNL lock is held here, so, it > is safe to iterate in the list without the RCU read lock, as suggested > by Eric. > > To alleviate the concern, modify the code to use > list_for_each_entry_rcu() with the RTNL-held argument. > > The annotation will raise an error only if RTNL or RCU read lock are > missing during iteration, signaling a legitimate problem, otherwise it > will avoid this false positive. > > This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls > this function as well. > > Signed-off-by: Breno Leitao <leitao@debian.org> > --- > Changes in v2: > - Instead of getting an RCU read lock, rely on rtnl mutex (Eric) > - Link to v1: https://lore.kernel.org/r/20241107-ipmr_rcu-v1-1-ad0cba8dffed@debian.org > - Still sending it against `net`, so, since this warning is annoying > --- > net/ipv4/ipmr_base.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > Reviewed-by: David Ahern <dsahern@kernel.org>
© 2016 - 2024 Red Hat, Inc.