[PATCH bpf-next v2] net: Fix RCU usage in task_cls_state() for BPF programs

Charalampos Mitrodimas posted 1 patch 4 months ago
There is a newer version of this series
Posted by Charalampos Mitrodimas 4 months ago
Commit ee971630f20f ("bpf: Allow some trace helpers for all prog
types") made the bpf_get_cgroup_classid_curr helper available to all
BPF program types, not just networking programs.

This helper calls __task_get_classid(), which internally calls
task_cls_state(), whose RCU check requires rcu_read_lock_bh_held().
This works in networking/tc context, where RCU BH is held, but triggers
an RCU warning when called from other contexts, such as BPF syscall
programs that run under rcu_read_lock_trace():

  WARNING: suspicious RCU usage
  6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
  -----------------------------
  net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!

Fix this by also accepting rcu_read_lock_trace_held() as a valid RCU
context in the task_cls_state() function. This is safe because BPF
programs are non-sleepable and task_cls_state() is only doing an RCU
dereference to get the classid.

Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
---
Changes in v2:
- Fix RCU usage in task_cls_state() instead of BPF helper
- Add rcu_read_lock_trace_held() check to accept trace RCU as a valid
  context
- Drop the approach of using task_cls_classid(), which has an
  in_interrupt() check
- Link to v1: https://lore.kernel.org/r/20250608-rcu-fix-task_cls_state-v1-1-2a2025b4603b@posteo.net
---
 net/core/netclassid_cgroup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
index d22f0919821e931fbdedf5a8a7a2998d59d73978..df86f82d747ac40e99597d6f2d921e8cc2834e64 100644
--- a/net/core/netclassid_cgroup.c
+++ b/net/core/netclassid_cgroup.c
@@ -21,7 +21,8 @@ static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state
 struct cgroup_cls_state *task_cls_state(struct task_struct *p)
 {
 	return css_cls_state(task_css_check(p, net_cls_cgrp_id,
-					    rcu_read_lock_bh_held()));
+					    rcu_read_lock_bh_held() ||
+					    rcu_read_lock_trace_held()));
 }
 EXPORT_SYMBOL_GPL(task_cls_state);
 

---
base-commit: 079e5c56a5c41d285068939ff7b0041ab10386fa
change-id: 20250608-rcu-fix-task_cls_state-0ed73f437d1e

Best regards,
-- 
Charalampos Mitrodimas <charmitro@posteo.net>
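
The effect of the change above can be illustrated with a small userspace
model. This is only a sketch, not kernel code: struct rcu_ctx,
check_before(), check_after() and would_warn() are hypothetical names,
and the model ignores any further conditions that the real
task_css_check() and rcu_dereference_check() macros may accept on their
own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's lockdep predicates. */
struct rcu_ctx {
	bool bh_held;    /* models rcu_read_lock_bh_held() */
	bool trace_held; /* models rcu_read_lock_trace_held() */
};

/* Condition passed by task_cls_state() before the patch: RCU BH only. */
static bool check_before(const struct rcu_ctx *c)
{
	return c->bh_held;
}

/* Condition passed after the v2 patch: RCU BH or RCU tasks trace. */
static bool check_after(const struct rcu_ctx *c)
{
	return c->bh_held || c->trace_held;
}

/*
 * Returns true when the modeled check would report
 * "suspicious rcu_dereference_check() usage".
 */
static bool would_warn(bool (*check)(const struct rcu_ctx *),
		       const struct rcu_ctx *c)
{
	return !check(c);
}
```

In this model, a context with only trace_held set (as for a BPF syscall
program) makes would_warn(check_before, ...) return true and
would_warn(check_after, ...) return false, while a tc-like context with
bh_held set passes both checks.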
Re: [PATCH bpf-next v2] net: Fix RCU usage in task_cls_state() for BPF programs
Posted by Alexei Starovoitov 4 months ago
On Wed, Jun 11, 2025 at 2:04 AM Charalampos Mitrodimas
<charmitro@posteo.net> wrote:
> [...]
>  struct cgroup_cls_state *task_cls_state(struct task_struct *p)
>  {
>         return css_cls_state(task_css_check(p, net_cls_cgrp_id,
> -                                           rcu_read_lock_bh_held()));
> +                                           rcu_read_lock_bh_held() ||
> +                                           rcu_read_lock_trace_held()));

This is incomplete. It only addresses one particular syzbot report.
It needs to include rcu_read_lock_held() as well.

pw-bot: cr
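
The review comment above asks for rcu_read_lock_held() to be accepted
in addition to the two flavors in v2. A minimal userspace sketch of what
that three-way condition amounts to, assuming each flavor can be modeled
as a bit flag: the enum, the ACCEPT_* masks and check_passes() are
hypothetical names, and the model does not cover conditions that the
real kernel macros may add implicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags modeling which RCU read-side flavor is held. */
enum rcu_flavor {
	RCU_NORMAL = 1 << 0, /* models rcu_read_lock_held() */
	RCU_BH     = 1 << 1, /* models rcu_read_lock_bh_held() */
	RCU_TRACE  = 1 << 2, /* models rcu_read_lock_trace_held() */
};

/* Flavors accepted by the condition in the v2 patch as posted. */
#define ACCEPT_V2       (RCU_BH | RCU_TRACE)

/* Flavors accepted if rcu_read_lock_held() is added as requested. */
#define ACCEPT_REVIEWED (RCU_NORMAL | RCU_BH | RCU_TRACE)

/* The modeled check passes when at least one accepted flavor is held. */
static bool check_passes(unsigned int accepted, unsigned int held)
{
	return (accepted & held) != 0;
}
```

Under this model, check_passes(ACCEPT_V2, RCU_NORMAL) is false while
check_passes(ACCEPT_REVIEWED, RCU_NORMAL) is true, which is the gap the
review points at.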
Re: [PATCH bpf-next v2] net: Fix RCU usage in task_cls_state() for BPF programs
Posted by Charalampos Mitrodimas 4 months ago
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Wed, Jun 11, 2025 at 2:04 AM Charalampos Mitrodimas
> <charmitro@posteo.net> wrote:
>> [...]
>>  struct cgroup_cls_state *task_cls_state(struct task_struct *p)
>>  {
>>         return css_cls_state(task_css_check(p, net_cls_cgrp_id,
>> -                                           rcu_read_lock_bh_held()));
>> +                                           rcu_read_lock_bh_held() ||
>> +                                           rcu_read_lock_trace_held()));
>
> This is incomplete. It only addresses one particular syzbot report.
> It needs to include rcu_read_lock_held() as well.

Which other report are you referring to?

>
> pw-bot: cr