[v2] ipmr: Fix access to mfc_cache_list without lock held

[PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Breno Leitao 1 year, 3 months ago

Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
following code flow, the RCU read lock is not held, causing the
following error when `RCU_PROVE` is not held. The same problem might
show up in the IPv6 code path.

	6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G            E    N
	-----------------------------
	net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!!

	rcu_scheduler_active = 2, debug_locks = 1
		   2 locks held by RetransmitAggre/3519:
		    #0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290
		    #1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90

	stack backtrace:
		    lockdep_rcu_suspicious
		    mr_table_dump
		    ipmr_rtm_dumproute
		    rtnl_dump_all
		    rtnl_dumpit
		    netlink_dump
		    __netlink_dump_start
		    rtnetlink_rcv_msg
		    netlink_rcv_skb
		    netlink_unicast
		    netlink_sendmsg

This is not a problem per see, since the RTNL lock is held here, so, it
is safe to iterate in the list without the RCU read lock, as suggested
by Eric.

To alleviate the concern, modify the code to use
list_for_each_entry_rcu() with the RTNL-held argument.

The annotation will raise an error only if RTNL or RCU read lock are
missing during iteration, signaling a legitimate problem, otherwise it
will avoid this false positive.

This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls
this function as well.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
Changes in v2:
- Instead of getting an RCU read lock, rely on rtnl mutex (Eric)
- Link to v1: https://lore.kernel.org/r/20241107-ipmr_rcu-v1-1-ad0cba8dffed@debian.org
- Still sending it against `net`, so, since this warning is annoying
---
 net/ipv4/ipmr_base.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/ipmr_base.c b/net/ipv4/ipmr_base.c
index 271dc03fc6dbd9b35db4d5782716679134f225e4..f0af12a2f70bcdf5ba54321bf7ebebe798318abb 100644
--- a/net/ipv4/ipmr_base.c
+++ b/net/ipv4/ipmr_base.c
@@ -310,7 +310,8 @@ int mr_table_dump(struct mr_table *mrt, struct sk_buff *skb,
 	if (filter->filter_set)
 		flags |= NLM_F_DUMP_FILTERED;
 
-	list_for_each_entry_rcu(mfc, &mrt->mfc_cache_list, list) {
+	list_for_each_entry_rcu(mfc, &mrt->mfc_cache_list, list,
+				lockdep_rtnl_is_held()) {
 		if (e < s_e)
 			goto next_entry;
 		if (filter->dev &&

---
base-commit: 25d70702142ac2115e75e01a0a985c6ea1d78033
change-id: 20241107-ipmr_rcu-291d85400b16

Best regards,
-- 
Breno Leitao <leitao@debian.org>

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Jakub Kicinski 1 year, 2 months ago

On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote:
> Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> following code flow, the RCU read lock is not held, causing the
> following error when `RCU_PROVE` is not held. The same problem might
> show up in the IPv6 code path.

good start, I hope someone can fix the gazillion warnings the CI 
is hitting on the table accesses :)

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Breno Leitao 1 year, 2 months ago

Hello Jakub,

On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote:
> On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote:
> > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> > following code flow, the RCU read lock is not held, causing the
> > following error when `RCU_PROVE` is not held. The same problem might
> > show up in the IPv6 code path.
> 
> good start, I hope someone can fix the gazillion warnings the CI 
> is hitting on the table accesses :)

If you have an updated list, I'd be happy to take a look.

Last time, all the problems I found were being discussed upstream
already. I am wondering if they didn't land upstream, or, if you have
warnings that are not being currently fixed.

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Jakub Kicinski 1 year, 2 months ago

On Thu, 14 Nov 2024 00:55:57 -0800 Breno Leitao wrote:
> On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote:
> > On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote:  
> > > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> > > following code flow, the RCU read lock is not held, causing the
> > > following error when `RCU_PROVE` is not held. The same problem might
> > > show up in the IPv6 code path.  
> > 
> > good start, I hope someone can fix the gazillion warnings the CI 
> > is hitting on the table accesses :)  
> 
> If you have an updated list, I'd be happy to take a look.
> 
> Last time, all the problems I found were being discussed upstream
> already. I am wondering if they didn't land upstream, or, if you have
> warnings that are not being currently fixed.

https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/859762/82-router-multicast-sh/stderr

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Breno Leitao 1 year, 2 months ago

On Thu, Nov 14, 2024 at 07:03:08AM -0800, Jakub Kicinski wrote:
> On Thu, 14 Nov 2024 00:55:57 -0800 Breno Leitao wrote:
> > On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote:
> > > On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote:  
> > > > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> > > > following code flow, the RCU read lock is not held, causing the
> > > > following error when `RCU_PROVE` is not held. The same problem might
> > > > show up in the IPv6 code path.  
> > > 
> > > good start, I hope someone can fix the gazillion warnings the CI 
> > > is hitting on the table accesses :)  
> > 
> > If you have an updated list, I'd be happy to take a look.
> > 
> > Last time, all the problems I found were being discussed upstream
> > already. I am wondering if they didn't land upstream, or, if you have
> > warnings that are not being currently fixed.
> 
> https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/859762/82-router-multicast-sh/stderr

This one seems to be discussed in the following thread already.

https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Jakub Kicinski 1 year, 2 months ago

On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
> This one seems to be discussed in the following thread already.
> 
> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/

That's why it rung a bell..
Stefan, are you planning to continue with the series?

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Stefan Wiehler 1 year, 2 months ago

> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>> This one seems to be discussed in the following thread already.
>>
>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
> 
> That's why it rung a bell..
> Stefan, are you planning to continue with the series?

Yes, sorry for the delay, went on vacation and was busy with other tasks, but
next week I plan to continue (i.e. refactor using refcount_t).

Kind regards,

Stefan

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Paolo Abeni 1 year, 2 months ago

On 11/15/24 17:07, Stefan Wiehler wrote:
>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>>> This one seems to be discussed in the following thread already.
>>>
>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
>>
>> That's why it rung a bell..
>> Stefan, are you planning to continue with the series?
> 
> Yes, sorry for the delay, went on vacation and was busy with other tasks, but
> next week I plan to continue (i.e. refactor using refcount_t).

I forgot about that series and spent a little time investigating the
scenario.

I think we don't need a refcount: the tables are freed only at netns
cleanup time, so the netns refcount is enough to guarantee that the
tables are not deleted when escaping the RCU section.

Some debug assertions could help clarify, document and make the schema
more robust to later change.

Side note, I think we need to drop the RCU lock moved by:

https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/

as the seqfile core can call blocking functions - alloc(GFP_KERNEL) -
between ->start() and ->stop().

The issue is pre-existent to that patch, and even to the patch
introducing the original RCU() - the old read_lock() created an illegal
atomic scope - but I think we should address it while touching this code.

I have some patches implementing the above but I have hard times testing
vs net/forwarding, as the forwarding ST setup is eluding me - with
mausezahn internal errors.

@Jakub: do you have by chance any cheap tip handy about the forwarding
self-tests setup?

Thanks,

Paolo

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Paolo Abeni 1 year, 2 months ago

On 11/15/24 17:55, Paolo Abeni wrote:
> On 11/15/24 17:07, Stefan Wiehler wrote:
>>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>>>> This one seems to be discussed in the following thread already.
>>>>
>>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
>>>
>>> That's why it rung a bell..
>>> Stefan, are you planning to continue with the series?
>>
>> Yes, sorry for the delay, went on vacation and was busy with other tasks, but
>> next week I plan to continue (i.e. refactor using refcount_t).
> 
> I forgot about that series and spent a little time investigating the
> scenario.
> 
> I think we don't need a refcount: the tables are freed only at netns
> cleanup time, so the netns refcount is enough to guarantee that the
> tables are not deleted when escaping the RCU section.
> 
> Some debug assertions could help clarify, document and make the schema
> more robust to later change.
> 
> Side note, I think we need to drop the RCU lock moved by:
> 
> https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/
> 
> as the seqfile core can call blocking functions - alloc(GFP_KERNEL) -
> between ->start() and ->stop().
> 
> The issue is pre-existent to that patch, and even to the patch
> introducing the original RCU() - the old read_lock() created an illegal
> atomic scope - but I think we should address it while touching this code.

@Stefan: are you ok if I go ahead with this work, or do you prefer
finish it yourself?

Thanks,

Paolo

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Stefan Wiehler 1 year, 2 months ago

> On 11/15/24 17:55, Paolo Abeni wrote:
>> On 11/15/24 17:07, Stefan Wiehler wrote:
>>>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>>>>> This one seems to be discussed in the following thread already.
>>>>>
>>>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
>>>>
>>>> That's why it rung a bell..
>>>> Stefan, are you planning to continue with the series?
>>>
>>> Yes, sorry for the delay, went on vacation and was busy with other tasks, but
>>> next week I plan to continue (i.e. refactor using refcount_t).
>>
>> I forgot about that series and spent a little time investigating the
>> scenario.
>>
>> I think we don't need a refcount: the tables are freed only at netns
>> cleanup time, so the netns refcount is enough to guarantee that the
>> tables are not deleted when escaping the RCU section.
>>
>> Some debug assertions could help clarify, document and make the schema
>> more robust to later change.
>>
>> Side note, I think we need to drop the RCU lock moved by:
>>
>> https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/
>>
>> as the seqfile core can call blocking functions - alloc(GFP_KERNEL) -
>> between ->start() and ->stop().
>>
>> The issue is pre-existent to that patch, and even to the patch
>> introducing the original RCU() - the old read_lock() created an illegal
>> atomic scope - but I think we should address it while touching this code.
> 
> @Stefan: are you ok if I go ahead with this work, or do you prefer
> finish it yourself?

Please go ahead, I have neither the expertise in the net subsystem nor the time
to ramp-up (since this is just a side finding for us right now) to proceed with
your proposal. I'll follow the discussion though and hope to learn something
along the way!

Kind regards,

Stefan

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by Jakub Kicinski 1 year, 2 months ago

On Fri, 15 Nov 2024 17:55:11 +0100 Paolo Abeni wrote:
> @Jakub: do you have by chance any cheap tip handy about the forwarding
> self-tests setup?

I presume you mean how to get all the weird tools it requires in place?

This is all the things we build on the worker:

https://github.com/linux-netdev/nipa/blob/main/deploy/contest/remote/worker-setup.sh

For mcast routing you only need a handful of those (mtools, smcroute,
ndisc6?), but we don't really track which is needed where :S

Ideally we'd use some image building to compose this automatically..
one day.

Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

Posted by David Ahern 1 year, 2 months ago

On 11/8/24 7:08 AM, Breno Leitao wrote:
> Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> following code flow, the RCU read lock is not held, causing the
> following error when `RCU_PROVE` is not held. The same problem might
> show up in the IPv6 code path.
> 
> 	6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G            E    N
> 	-----------------------------
> 	net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!!
> 
> 	rcu_scheduler_active = 2, debug_locks = 1
> 		   2 locks held by RetransmitAggre/3519:
> 		    #0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290
> 		    #1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90
> 
> 	stack backtrace:
> 		    lockdep_rcu_suspicious
> 		    mr_table_dump
> 		    ipmr_rtm_dumproute
> 		    rtnl_dump_all
> 		    rtnl_dumpit
> 		    netlink_dump
> 		    __netlink_dump_start
> 		    rtnetlink_rcv_msg
> 		    netlink_rcv_skb
> 		    netlink_unicast
> 		    netlink_sendmsg
> 
> This is not a problem per see, since the RTNL lock is held here, so, it
> is safe to iterate in the list without the RCU read lock, as suggested
> by Eric.
> 
> To alleviate the concern, modify the code to use
> list_for_each_entry_rcu() with the RTNL-held argument.
> 
> The annotation will raise an error only if RTNL or RCU read lock are
> missing during iteration, signaling a legitimate problem, otherwise it
> will avoid this false positive.
> 
> This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls
> this function as well.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> Changes in v2:
> - Instead of getting an RCU read lock, rely on rtnl mutex (Eric)
> - Link to v1: https://lore.kernel.org/r/20241107-ipmr_rcu-v1-1-ad0cba8dffed@debian.org
> - Still sending it against `net`, so, since this warning is annoying
> ---
>  net/ipv4/ipmr_base.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>