[PATCH net v4 0/5] Lock RCU before calling ip6mr_get_table()

Stefan Wiehler posted 5 patches 1 month, 2 weeks ago
There is a newer version of this series
net/ipv6/ip6mr.c | 105 +++++++++++++++++++++++++++++------------------
1 file changed, 66 insertions(+), 39 deletions(-)
[PATCH net v4 0/5] Lock RCU before calling ip6mr_get_table()
Posted by Stefan Wiehler 1 month, 2 weeks ago
Lock RCU before calling ip6mr_get_table() in several ip6mr functions.

v4:
  - mention in commit message that ip6mr_vif_seq_stop() would be called
    in case ip6mr_vif_seq_start() returns an error
  - fix unitialised use of mrt variable
  - revert commit b6dd5acde3f1 ("ipv6: Fix suspicious RCU usage warning
    in ip6mr")
v3: https://patchwork.kernel.org/project/netdevbpf/patch/20241010090741.1980100-2-stefan.wiehler@nokia.com/
  - split into separate patches
v2: https://patchwork.kernel.org/project/netdevbpf/patch/20241001100119.230711-2-stefan.wiehler@nokia.com/
  - rebase on top of net tree
  - add Fixes tag
  - refactor out paths
v1: https://patchwork.kernel.org/project/netdevbpf/patch/20240605195355.363936-1-oss@malat.biz/

Stefan Wiehler (5):
  ip6mr: Lock RCU before ip6mr_get_table() call in ip6mr_vif_seq_start()
  ip6mr: Lock RCU before ip6mr_get_table() call in ip6mr_ioctl()
  ip6mr: Lock RCU before ip6mr_get_table() call in ip6mr_compat_ioctl()
  ip6mr: Lock RCU before ip6mr_get_table() call in ip6mr_get_route()
  Revert "ipv6: Fix suspicious RCU usage warning in ip6mr"

 net/ipv6/ip6mr.c | 105 +++++++++++++++++++++++++++++------------------
 1 file changed, 66 insertions(+), 39 deletions(-)

-- 
2.42.0
Re: [PATCH net v4 0/5] Lock RCU before calling ip6mr_get_table()
Posted by Eric Dumazet 1 month, 2 weeks ago
On Fri, Oct 11, 2024 at 9:48 AM Stefan Wiehler <stefan.wiehler@nokia.com> wrote:
>
> Lock RCU before calling ip6mr_get_table() in several ip6mr functions.
>
> v4:
>   - mention in commit message that ip6mr_vif_seq_stop() would be called
>     in case ip6mr_vif_seq_start() returns an error
>   - fix unitialised use of mrt variable
>   - revert commit b6dd5acde3f1 ("ipv6: Fix suspicious RCU usage warning
>     in ip6mr")

Hi Stefan. I think a v5 is needed :)

Please double check your syslog

[   18.149447] =============================
[   18.149471] WARNING: suspicious RCU usage
[   18.149649] 6.12.0-rc2-virtme #1155 Not tainted
[   18.149726] -----------------------------
[   18.149747] net/ipv6/ip6mr.c:131 RCU-list traversed in non-reader section!!
[   18.149792]
other info that might help us debug this:

[   18.149824]
rcu_scheduler_active = 2, debug_locks = 1
[   18.150050] 1 lock held by swapper/0/1:
[   18.150090] #0: ffffffff95b36390 (pernet_ops_rwsem){+.+.}-{3:3},
at: register_pernet_subsys (net/core/net_namespace.c:1356)
[   18.151482]
stack backtrace:
[   18.151716] CPU: 12 UID: 0 PID: 1 Comm: swapper/0 Not tainted
6.12.0-rc2-virtme #1155
[   18.151809] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   18.151982] Call Trace:
[   18.152122]  <TASK>
[   18.152411] dump_stack_lvl (lib/dump_stack.c:123)
[   18.152411] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
[   18.152411] ip6mr_get_table (net/ipv6/ip6mr.c:131 (discriminator 9))
[   18.152411] ip6mr_net_init (net/ipv6/ip6mr.c:384
net/ipv6/ip6mr.c:238 net/ipv6/ip6mr.c:1317 net/ipv6/ip6mr.c:1309)
[   18.152411] ops_init (net/core/net_namespace.c:139)
[   18.152411] register_pernet_operations
(net/core/net_namespace.c:1239 net/core/net_namespace.c:1315)
[   18.152411] register_pernet_subsys (net/core/net_namespace.c:1357)
[   18.152411] ip6_mr_init (net/ipv6/ip6mr.c:1379)
[   18.152411] inet6_init (net/ipv6/af_inet6.c:1137)
[   18.152411] ? __pfx_inet6_init (net/ipv6/af_inet6.c:1076)
[   18.152411] do_one_initcall (init/main.c:1269)
[   18.152411] ? _raw_spin_unlock_irqrestore
(./arch/x86/include/asm/irqflags.h:42
./arch/x86/include/asm/irqflags.h:97
./arch/x86/include/asm/irqflags.h:155
./include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
[   18.152411] kernel_init_freeable (init/main.c:1330 (discriminator
1) init/main.c:1347 (discriminator 1) init/main.c:1366 (discriminator
1) init/main.c:1580 (discriminator 1))
[   18.152411] ? __pfx_kernel_init (init/main.c:1461)
[   18.152411] kernel_init (init/main.c:1471)
[   18.152411] ret_from_fork (arch/x86/kernel/process.c:153)
[   18.152411] ? __pfx_kernel_init (init/main.c:1461)
[   18.152411] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
[   18.152411]  </TASK>

Thank you.
Re: [PATCH net v4 0/5] Lock RCU before calling ip6mr_get_table()
Posted by Stefan Wiehler 1 month, 1 week ago
> Hi Stefan. I think a v5 is needed :)
> 
> Please double check your syslog
> 
> [   18.149447] =============================
> [   18.149471] WARNING: suspicious RCU usage
> [   18.149649] 6.12.0-rc2-virtme #1155 Not tainted
> [   18.149726] -----------------------------
> [   18.149747] net/ipv6/ip6mr.c:131 RCU-list traversed in non-reader section!!
> [   18.149792]
> other info that might help us debug this:
> 
> [   18.149824]
> rcu_scheduler_active = 2, debug_locks = 1
> [   18.150050] 1 lock held by swapper/0/1:
> [   18.150090] #0: ffffffff95b36390 (pernet_ops_rwsem){+.+.}-{3:3},
> at: register_pernet_subsys (net/core/net_namespace.c:1356)
> [   18.151482]
> stack backtrace:
> [   18.151716] CPU: 12 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> 6.12.0-rc2-virtme #1155
> [   18.151809] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [   18.151982] Call Trace:
> [   18.152122]  <TASK>
> [   18.152411] dump_stack_lvl (lib/dump_stack.c:123)
> [   18.152411] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
> [   18.152411] ip6mr_get_table (net/ipv6/ip6mr.c:131 (discriminator 9))
> [   18.152411] ip6mr_net_init (net/ipv6/ip6mr.c:384
> net/ipv6/ip6mr.c:238 net/ipv6/ip6mr.c:1317 net/ipv6/ip6mr.c:1309)
> [   18.152411] ops_init (net/core/net_namespace.c:139)
> [   18.152411] register_pernet_operations
> (net/core/net_namespace.c:1239 net/core/net_namespace.c:1315)
> [   18.152411] register_pernet_subsys (net/core/net_namespace.c:1357)
> [   18.152411] ip6_mr_init (net/ipv6/ip6mr.c:1379)
> [   18.152411] inet6_init (net/ipv6/af_inet6.c:1137)
> [   18.152411] ? __pfx_inet6_init (net/ipv6/af_inet6.c:1076)
> [   18.152411] do_one_initcall (init/main.c:1269)
> [   18.152411] ? _raw_spin_unlock_irqrestore
> (./arch/x86/include/asm/irqflags.h:42
> ./arch/x86/include/asm/irqflags.h:97
> ./arch/x86/include/asm/irqflags.h:155
> ./include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
> [   18.152411] kernel_init_freeable (init/main.c:1330 (discriminator
> 1) init/main.c:1347 (discriminator 1) init/main.c:1366 (discriminator
> 1) init/main.c:1580 (discriminator 1))
> [   18.152411] ? __pfx_kernel_init (init/main.c:1461)
> [   18.152411] kernel_init (init/main.c:1471)
> [   18.152411] ret_from_fork (arch/x86/kernel/process.c:153)
> [   18.152411] ? __pfx_kernel_init (init/main.c:1461)
> [   18.152411] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
> [   18.152411]  </TASK>

Thanks, I'm not sure why I missed that one since it also shows up on our v6.1-based kernel.

I went through all the remaining calls of ip6mr_get_table() in ip6mr.c (since
I've picked this topic up from a colleague):

- call in ip6mr_rule_action is safe because fib_rules_lookup() holds RCU lock
- call in ipmr_mfc_seq_start() needs to be in RCU read-side critical section as well
- calls in ip6mr_rtm_(set|get)sockopt() need to be in RCU read-side critical section as well
- call in ip6mr_rtm_getroute() needs to hold RCU read lock earlier as well
- call in ip6mr_rtm_dumproute() is safe because rtnl_register_internal() holds the RTNL lock

I'll prepare a v5; please carefully review as I'm not familar with the codebase...