[PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper

Charalampos Mitrodimas posted 1 patch 6 months, 2 weeks ago
net/core/filter.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
Posted by Charalampos Mitrodimas 6 months, 2 weeks ago
The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
types") made bpf_get_cgroup_classid_curr helper available to all BPF
program types.  This helper used __task_get_classid() which calls
task_cls_state() that requires rcu_read_lock_bh_held().

This triggers an RCU warning when called from BPF syscall programs
which run under rcu_read_lock_trace():

  WARNING: suspicious RCU usage
  6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
  -----------------------------
  net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!

Fix this by replacing __task_get_classid() with task_cls_classid()
which handles RCU locking internally using regular rcu_read_lock() and
is safe to call from any context.

Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
---
 net/core/filter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
 #ifdef CONFIG_CGROUP_NET_CLASSID
 BPF_CALL_0(bpf_get_cgroup_classid_curr)
 {
-	return __task_get_classid(current);
+	return task_cls_classid(current);
 }
 
 const struct bpf_func_proto bpf_get_cgroup_classid_curr_proto = {

---
base-commit: 079e5c56a5c41d285068939ff7b0041ab10386fa
change-id: 20250608-rcu-fix-task_cls_state-0ed73f437d1e

Best regards,
-- 
Charalampos Mitrodimas <charmitro@posteo.net>
Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
Posted by Alexei Starovoitov 6 months, 1 week ago
On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
<charmitro@posteo.net> wrote:
>
> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
> types") made bpf_get_cgroup_classid_curr helper available to all BPF
> program types.  This helper used __task_get_classid() which calls
> task_cls_state() that requires rcu_read_lock_bh_held().
>
> This triggers an RCU warning when called from BPF syscall programs
> which run under rcu_read_lock_trace():
>
>   WARNING: suspicious RCU usage
>   6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>   -----------------------------
>   net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>
> Fix this by replacing __task_get_classid() with task_cls_classid()
> which handles RCU locking internally using regular rcu_read_lock() and
> is safe to call from any context.
>
> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
> ---
>  net/core/filter.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>  #ifdef CONFIG_CGROUP_NET_CLASSID
>  BPF_CALL_0(bpf_get_cgroup_classid_curr)
>  {
> -       return __task_get_classid(current);
> +       return task_cls_classid(current);
>  }

Daniel added this helper in
commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
with intention to use it from networking hooks.

But task_cls_classid() has
        if (in_interrupt())
                return 0;

which will trigger in softirq and tc hooks.
So this might break Daniel's use case.

Daniel,
ptal.
Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
Posted by Daniel Borkmann 6 months, 1 week ago
On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
> On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
> <charmitro@posteo.net> wrote:
>>
>> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> program types.  This helper used __task_get_classid() which calls
>> task_cls_state() that requires rcu_read_lock_bh_held().
>>
>> This triggers an RCU warning when called from BPF syscall programs
>> which run under rcu_read_lock_trace():
>>
>>    WARNING: suspicious RCU usage
>>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>>    -----------------------------
>>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>>
>> Fix this by replacing __task_get_classid() with task_cls_classid()
>> which handles RCU locking internally using regular rcu_read_lock() and
>> is safe to call from any context.
>>
>> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
>> ---
>>   net/core/filter.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>>   #ifdef CONFIG_CGROUP_NET_CLASSID
>>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
>>   {
>> -       return __task_get_classid(current);
>> +       return task_cls_classid(current);
>>   }
> 
> Daniel added this helper in
> commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
> with intention to use it from networking hooks.
> 
> But task_cls_classid() has
>          if (in_interrupt())
>                  return 0;
> 
> which will trigger in softirq and tc hooks.
> So this might break Daniel's use case.

Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
a new helper implementation for the more generic, non-networking case which
then internally uses task_cls_classid().

Thanks,
Daniel
Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
Posted by Alexei Starovoitov 6 months, 1 week ago
On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
> > <charmitro@posteo.net> wrote:
> >>
> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
> >> program types.  This helper used __task_get_classid() which calls
> >> task_cls_state() that requires rcu_read_lock_bh_held().
> >>
> >> This triggers an RCU warning when called from BPF syscall programs
> >> which run under rcu_read_lock_trace():
> >>
> >>    WARNING: suspicious RCU usage
> >>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
> >>    -----------------------------
> >>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
> >>
> >> Fix this by replacing __task_get_classid() with task_cls_classid()
> >> which handles RCU locking internally using regular rcu_read_lock() and
> >> is safe to call from any context.
> >>
> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
> >> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
> >> ---
> >>   net/core/filter.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/net/core/filter.c b/net/core/filter.c
> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
> >> --- a/net/core/filter.c
> >> +++ b/net/core/filter.c
> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
> >>   #ifdef CONFIG_CGROUP_NET_CLASSID
> >>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
> >>   {
> >> -       return __task_get_classid(current);
> >> +       return task_cls_classid(current);
> >>   }
> >
> > Daniel added this helper in
> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
> > with intention to use it from networking hooks.
> >
> > But task_cls_classid() has
> >          if (in_interrupt())
> >                  return 0;
> >
> > which will trigger in softirq and tc hooks.
> > So this might break Daniel's use case.
>
> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
> a new helper implementation for the more generic, non-networking case which
> then internally uses task_cls_classid().

Instead of forking the helper I think we can :
rcu_read_lock_bh_held() || rcu_read_lock_held()
in task_cls_state().
And that will do it.
Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
Posted by Charalampos Mitrodimas 6 months, 1 week ago
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
>> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
>> > <charmitro@posteo.net> wrote:
>> >>
>> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> >> program types.  This helper used __task_get_classid() which calls
>> >> task_cls_state() that requires rcu_read_lock_bh_held().
>> >>
>> >> This triggers an RCU warning when called from BPF syscall programs
>> >> which run under rcu_read_lock_trace():
>> >>
>> >>    WARNING: suspicious RCU usage
>> >>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>> >>    -----------------------------
>> >>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>> >>
>> >> Fix this by replacing __task_get_classid() with task_cls_classid()
>> >> which handles RCU locking internally using regular rcu_read_lock() and
>> >> is safe to call from any context.
>> >>
>> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
>> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> >> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
>> >> ---
>> >>   net/core/filter.c | 2 +-
>> >>   1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/net/core/filter.c b/net/core/filter.c
>> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>> >> --- a/net/core/filter.c
>> >> +++ b/net/core/filter.c
>> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>> >>   #ifdef CONFIG_CGROUP_NET_CLASSID
>> >>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
>> >>   {
>> >> -       return __task_get_classid(current);
>> >> +       return task_cls_classid(current);
>> >>   }
>> >
>> > Daniel added this helper in
>> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
>> > with intention to use it from networking hooks.
>> >
>> > But task_cls_classid() has
>> >          if (in_interrupt())
>> >                  return 0;
>> >
>> > which will trigger in softirq and tc hooks.
>> > So this might break Daniel's use case.
>>
>> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
>> a new helper implementation for the more generic, non-networking case which
>> then internally uses task_cls_classid().
>
> Instead of forking the helper I think we can :
> rcu_read_lock_bh_held() || rcu_read_lock_held()
> in task_cls_state().

I tested your suggestion with,

  rcu_read_lock_bh_held() || rcu_read_lock_held()

but it still triggers the RCU warning because BPF syscall programs use
rcu_read_lock_trace().

Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
checkpatch warning:

  WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code

I think the best solution here would be to add
local_bh_disable()/enable() protection directly in the BPF helper. This
keeps the fix localized to where the problem exists, and avoids
modifying core cgroup RCU.

> And that will do it.
Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
Posted by Alexei Starovoitov 6 months, 1 week ago
On Tue, Jun 10, 2025 at 8:23 AM Charalampos Mitrodimas
<charmitro@posteo.net> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>
> > On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>
> >> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
> >> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
> >> > <charmitro@posteo.net> wrote:
> >> >>
> >> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
> >> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
> >> >> program types.  This helper used __task_get_classid() which calls
> >> >> task_cls_state() that requires rcu_read_lock_bh_held().
> >> >>
> >> >> This triggers an RCU warning when called from BPF syscall programs
> >> >> which run under rcu_read_lock_trace():
> >> >>
> >> >>    WARNING: suspicious RCU usage
> >> >>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
> >> >>    -----------------------------
> >> >>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
> >> >>
> >> >> Fix this by replacing __task_get_classid() with task_cls_classid()
> >> >> which handles RCU locking internally using regular rcu_read_lock() and
> >> >> is safe to call from any context.
> >> >>
> >> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
> >> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
> >> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
> >> >> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
> >> >> ---
> >> >>   net/core/filter.c | 2 +-
> >> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/net/core/filter.c b/net/core/filter.c
> >> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
> >> >> --- a/net/core/filter.c
> >> >> +++ b/net/core/filter.c
> >> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
> >> >>   #ifdef CONFIG_CGROUP_NET_CLASSID
> >> >>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
> >> >>   {
> >> >> -       return __task_get_classid(current);
> >> >> +       return task_cls_classid(current);
> >> >>   }
> >> >
> >> > Daniel added this helper in
> >> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
> >> > with intention to use it from networking hooks.
> >> >
> >> > But task_cls_classid() has
> >> >          if (in_interrupt())
> >> >                  return 0;
> >> >
> >> > which will trigger in softirq and tc hooks.
> >> > So this might break Daniel's use case.
> >>
> >> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
> >> a new helper implementation for the more generic, non-networking case which
> >> then internally uses task_cls_classid().
> >
> > Instead of forking the helper I think we can :
> > rcu_read_lock_bh_held() || rcu_read_lock_held()
> > in task_cls_state().
>
> I tested your suggestion with,
>
>   rcu_read_lock_bh_held() || rcu_read_lock_held()
>
> but it still triggers the RCU warning because BPF syscall programs use
> rcu_read_lock_trace().
>
> Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
> checkpatch warning:
>
>   WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code

It's safe to ignore checkpatch in this case.

> I think the best solution here would be to add
> local_bh_disable()/enable() protection directly in the BPF helper. This
> keeps the fix localized to where the problem exists, and avoids
> modifying core cgroup RCU.

That works, but it will add runtime overhead.
I doubt the helper is in a critical path, so I don't mind.
Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
Posted by Charalampos Mitrodimas 6 months, 1 week ago
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Tue, Jun 10, 2025 at 8:23 AM Charalampos Mitrodimas
> <charmitro@posteo.net> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> >>
>> >> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
>> >> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
>> >> > <charmitro@posteo.net> wrote:
>> >> >>
>> >> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> >> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> >> >> program types.  This helper used __task_get_classid() which calls
>> >> >> task_cls_state() that requires rcu_read_lock_bh_held().
>> >> >>
>> >> >> This triggers an RCU warning when called from BPF syscall programs
>> >> >> which run under rcu_read_lock_trace():
>> >> >>
>> >> >>    WARNING: suspicious RCU usage
>> >> >>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>> >> >>    -----------------------------
>> >> >>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>> >> >>
>> >> >> Fix this by replacing __task_get_classid() with task_cls_classid()
>> >> >> which handles RCU locking internally using regular rcu_read_lock() and
>> >> >> is safe to call from any context.
>> >> >>
>> >> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
>> >> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> >> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> >> >> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
>> >> >> ---
>> >> >>   net/core/filter.c | 2 +-
>> >> >>   1 file changed, 1 insertion(+), 1 deletion(-)
>> >> >>
>> >> >> diff --git a/net/core/filter.c b/net/core/filter.c
>> >> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>> >> >> --- a/net/core/filter.c
>> >> >> +++ b/net/core/filter.c
>> >> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>> >> >>   #ifdef CONFIG_CGROUP_NET_CLASSID
>> >> >>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
>> >> >>   {
>> >> >> -       return __task_get_classid(current);
>> >> >> +       return task_cls_classid(current);
>> >> >>   }
>> >> >
>> >> > Daniel added this helper in
>> >> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
>> >> > with intention to use it from networking hooks.
>> >> >
>> >> > But task_cls_classid() has
>> >> >          if (in_interrupt())
>> >> >                  return 0;
>> >> >
>> >> > which will trigger in softirq and tc hooks.
>> >> > So this might break Daniel's use case.
>> >>
>> >> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
>> >> a new helper implementation for the more generic, non-networking case which
>> >> then internally uses task_cls_classid().
>> >
>> > Instead of forking the helper I think we can :
>> > rcu_read_lock_bh_held() || rcu_read_lock_held()
>> > in task_cls_state().
>>
>> I tested your suggestion with,
>>
>>   rcu_read_lock_bh_held() || rcu_read_lock_held()
>>
>> but it still triggers the RCU warning because BPF syscall programs use
>> rcu_read_lock_trace().
>>
>> Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
>> checkpatch warning:
>>
>>   WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code
>
> It's safe to ignore checkpatch in this case.

If that is the case I'll move forward with this. It was my initial fix
for this[1] anyway, but checkpatch made me change it.

[1]: https://github.com/charmitro/linux/commit/e5c42d49bfb967c3c35f536971f397492d2f46bf

>
>> I think the best solution here would be to add
>> local_bh_disable()/enable() protection directly in the BPF helper. This
>> keeps the fix localized to where the problem exists, and avoids
>> modifying core cgroup RCU.
>
> That works, but it will add runtime overhead.
> I doubt the helper is in a critical path, so I don't mind.
Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
Posted by Daniel Borkmann 6 months, 1 week ago
On 6/10/25 5:51 PM, Charalampos Mitrodimas wrote:
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> On Tue, Jun 10, 2025 at 8:23 AM Charalampos Mitrodimas
>> <charmitro@posteo.net> wrote:
>>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>>> On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
>>>>>> On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
>>>>>> <charmitro@posteo.net> wrote:
>>>>>>>
>>>>>>> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>>>>>>> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>>>>>>> program types.  This helper used __task_get_classid() which calls
>>>>>>> task_cls_state() that requires rcu_read_lock_bh_held().
>>>>>>>
>>>>>>> This triggers an RCU warning when called from BPF syscall programs
>>>>>>> which run under rcu_read_lock_trace():
>>>>>>>
>>>>>>>     WARNING: suspicious RCU usage
>>>>>>>     6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>>>>>>>     -----------------------------
>>>>>>>     net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>>>>>>>
>>>>>>> Fix this by replacing __task_get_classid() with task_cls_classid()
>>>>>>> which handles RCU locking internally using regular rcu_read_lock() and
>>>>>>> is safe to call from any context.
>>>>>>>
>>>>>>> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
>>>>>>> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>>>>>>> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>>>>>>> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
>>>>>>> ---
>>>>>>>    net/core/filter.c | 2 +-
>>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>>>>> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>>>>>>> --- a/net/core/filter.c
>>>>>>> +++ b/net/core/filter.c
>>>>>>> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>>>>>>>    #ifdef CONFIG_CGROUP_NET_CLASSID
>>>>>>>    BPF_CALL_0(bpf_get_cgroup_classid_curr)
>>>>>>>    {
>>>>>>> -       return __task_get_classid(current);
>>>>>>> +       return task_cls_classid(current);
>>>>>>>    }
>>>>>>
>>>>>> Daniel added this helper in
>>>>>> commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
>>>>>> with intention to use it from networking hooks.
>>>>>>
>>>>>> But task_cls_classid() has
>>>>>>           if (in_interrupt())
>>>>>>                   return 0;
>>>>>>
>>>>>> which will trigger in softirq and tc hooks.
>>>>>> So this might break Daniel's use case.
>>>>>
>>>>> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
>>>>> a new helper implementation for the more generic, non-networking case which
>>>>> then internally uses task_cls_classid().
>>>>
>>>> Instead of forking the helper I think we can :
>>>> rcu_read_lock_bh_held() || rcu_read_lock_held()
>>>> in task_cls_state().
>>>
>>> I tested your suggestion with,
>>>
>>>    rcu_read_lock_bh_held() || rcu_read_lock_held()
>>>
>>> but it still triggers the RCU warning because BPF syscall programs use
>>> rcu_read_lock_trace().
>>>
>>> Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
>>> checkpatch warning:
>>>
>>>    WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code
>>
>> It's safe to ignore checkpatch in this case.
> 
> If that is the case I'll move forward with this. It was my initial fix
> for this[1] anyway, but checkpatch made me change it.

Agree that one is better!

> [1]: https://github.com/charmitro/linux/commit/e5c42d49bfb967c3c35f536971f397492d2f46bf

Thanks,
Daniel