[PATCH v2] cgroup: Add preemption protection to css_rstat_updated()

Jiayuan Chen posted 1 patch 1 week, 3 days ago
kernel/cgroup/rstat.c | 39 +++++++++++++++++++++++----------------
1 file changed, 23 insertions(+), 16 deletions(-)
[PATCH v2] cgroup: Add preemption protection to css_rstat_updated()
Posted by Jiayuan Chen 1 week, 3 days ago
BPF programs do not disable preemption, they only disable migration.
Therefore, when running the cgroup_hierarchical_stats selftest, a
warning [1] is generated.

The css_rstat_updated() function is lockless and reentrant. However,
as Tejun pointed out [2], preemption-related considerations need to
be considered. Since css_rstat_updated() can be called from BPF where
preemption is not disabled by its framework and it has already been
exposed as a kfunc to BPF programs, introducing a new kfunc like bpf_xx
will break existing uses. Thus, we directly make css_rstat_updated()
preempt-safe here.

[1]:
~/tools/testing/selftests/bpf$
test_progs -a cgroup_hierarchical_stats

...
------------[ cut here ]------------
WARNING: CPU: 0 PID: 382 at kernel/cgroup/rstat.c:84
Modules linked in:
RIP: 0010:css_rstat_updated+0x9d/0x160
...
PKRU: 55555554
Call Trace:
 <TASK>
 bpf_prog_16a1c2d081688506_counter+0x143/0x14e
 bpf_trampoline_6442524909+0x4b/0xb7
 cgroup_attach_task+0x5/0x330
 ? __cgroup_procs_write+0x1d7/0x2f0
 cgroup_procs_write+0x17/0x30
 cgroup_file_write+0xa6/0x2d0
 kernfs_fop_write_iter+0x188/0x240
 vfs_write+0x2da/0x5a0
 ksys_write+0x77/0x100
 __x64_sys_write+0x19/0x30
 x64_sys_call+0x79/0x26a0
 do_syscall_64+0x89/0x790
 ? irqentry_exit+0x77/0xb0
 ? __this_cpu_preempt_check+0x13/0x20
 ? lockdep_hardirqs_on+0xce/0x170
 ? irqentry_exit_to_user_mode+0xf2/0x290
 ? irqentry_exit+0x77/0xb0
 ? clear_bhb_loop+0x50/0xa0
 ? clear_bhb_loop+0x50/0xa0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
---[ end trace 0000000000000000 ]---

[2]: https://lore.kernel.org/cgroups/445b0d155b7a3cb84452aa7010669e293e8c37db@linux.dev/T/

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>

---
v1 -> v2: Add preemption protection instread of dropping preemption
          assert
v1: https://lore.kernel.org/cgroups/445b0d155b7a3cb84452aa7010669e293e8c37db@linux.dev/T/
---
 kernel/cgroup/rstat.c | 39 +++++++++++++++++++++++----------------
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index a198e40c799b..ec68c653545c 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -52,22 +52,7 @@ static inline struct llist_head *ss_lhead_cpu(struct cgroup_subsys *ss, int cpu)
 	return per_cpu_ptr(&rstat_backlog_list, cpu);
 }
 
-/**
- * css_rstat_updated - keep track of updated rstat_cpu
- * @css: target cgroup subsystem state
- * @cpu: cpu on which rstat_cpu was updated
- *
- * Atomically inserts the css in the ss's llist for the given cpu. This is
- * reentrant safe i.e. safe against softirq, hardirq and nmi. The ss's llist
- * will be processed at the flush time to create the update tree.
- *
- * NOTE: if the user needs the guarantee that the updater either add itself in
- * the lockless list or the concurrent flusher flushes its updated stats, a
- * memory barrier is needed before the call to css_rstat_updated() i.e. a
- * barrier after updating the per-cpu stats and before calling
- * css_rstat_updated().
- */
-__bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
+static void __css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
 {
 	struct llist_head *lhead;
 	struct css_rstat_cpu *rstatc;
@@ -122,6 +107,28 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
 	llist_add(&rstatc->lnode, lhead);
 }
 
+/**
+ * css_rstat_updated - keep track of updated rstat_cpu
+ * @css: target cgroup subsystem state
+ * @cpu: cpu on which rstat_cpu was updated
+ *
+ * Atomically inserts the css in the ss's llist for the given cpu. This is
+ * reentrant safe i.e. safe against softirq, hardirq and nmi. The ss's llist
+ * will be processed at the flush time to create the update tree.
+ *
+ * NOTE: if the user needs the guarantee that the updater either add itself in
+ * the lockless list or the concurrent flusher flushes its updated stats, a
+ * memory barrier is needed before the call to css_rstat_updated() i.e. a
+ * barrier after updating the per-cpu stats and before calling
+ * css_rstat_updated().
+ */
+__bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
+{
+	preempt_disable();
+	__css_rstat_updated(css, cpu);
+	preempt_enable();
+}
+
 static void __css_process_update_tree(struct cgroup_subsys_state *css, int cpu)
 {
 	/* put @css and all ancestors on the corresponding updated lists */
-- 
2.43.0
Re: [PATCH v2] cgroup: Add preemption protection to css_rstat_updated()
Posted by Waiman Long 1 week, 3 days ago
On 11/20/25 11:06 PM, Jiayuan Chen wrote:
> BPF programs do not disable preemption, they only disable migration.
> Therefore, when running the cgroup_hierarchical_stats selftest, a
> warning [1] is generated.
>
> The css_rstat_updated() function is lockless and reentrant. However,
> as Tejun pointed out [2], preemption-related considerations need to
> be considered. Since css_rstat_updated() can be called from BPF where
> preemption is not disabled by its framework and it has already been
> exposed as a kfunc to BPF programs, introducing a new kfunc like bpf_xx
> will break existing uses. Thus, we directly make css_rstat_updated()
> preempt-safe here.

My understand of Tejun's comment is to add bpf_preempt_disable() and 
bpf_preempt_enable() calls around the css_rstat_updated() call in the 
bpf program defined in 
tools/testing/selftests/bpf/prog_tests/cgroup_hierarchical_stats.c 
instead of adding that in the css_rstat_updated() function itself. But I 
may be wrong.

Cheers, Longman
Re: [PATCH v2] cgroup: Add preemption protection to css_rstat_updated()
Posted by Jiayuan Chen 1 week, 3 days ago
November 21, 2025 at 13:07, "Waiman Long" <llong@redhat.com mailto:llong@redhat.com?to=%22Waiman%20Long%22%20%3Cllong%40redhat.com%3E > wrote:


> 
> On 11/20/25 11:06 PM, Jiayuan Chen wrote:
> 
> > 
> > BPF programs do not disable preemption, they only disable migration.
> >  Therefore, when running the cgroup_hierarchical_stats selftest, a
> >  warning [1] is generated.
> > 
> >  The css_rstat_updated() function is lockless and reentrant. However,
> >  as Tejun pointed out [2], preemption-related considerations need to
> >  be considered. Since css_rstat_updated() can be called from BPF where
> >  preemption is not disabled by its framework and it has already been
> >  exposed as a kfunc to BPF programs, introducing a new kfunc like bpf_xx
> >  will break existing uses. Thus, we directly make css_rstat_updated()
> >  preempt-safe here.
> > 
> My understand of Tejun's comment is to add bpf_preempt_disable() and bpf_preempt_enable() calls around the css_rstat_updated() call in the bpf program defined in tools/testing/selftests/bpf/prog_tests/cgroup_hierarchical_stats.c instead of adding that in the css_rstat_updated() function itself. But I may be wrong.
> 
> Cheers, Longman
>

If that's really the case, then I'd rather add a new wrapper kfunc for BPF
to replace css_rstat_updated(). Otherwise, whether it gets triggered would
depend entirely on users behavior.

Right now, this WARNING is showing up in all BPF selftests. Although it's not
treated as an error that fails the tests,it's visible in the action runs:
https://github.com/kernel-patches/bpf/actions
Re: [PATCH v2] cgroup: Add preemption protection to css_rstat_updated()
Posted by Waiman Long 1 week, 3 days ago
On 11/21/25 1:21 AM, Jiayuan Chen wrote:
> November 21, 2025 at 13:07, "Waiman Long" <llong@redhat.com mailto:llong@redhat.com?to=%22Waiman%20Long%22%20%3Cllong%40redhat.com%3E > wrote:
>
>
>> On 11/20/25 11:06 PM, Jiayuan Chen wrote:
>>
>>> BPF programs do not disable preemption, they only disable migration.
>>>   Therefore, when running the cgroup_hierarchical_stats selftest, a
>>>   warning [1] is generated.
>>>
>>>   The css_rstat_updated() function is lockless and reentrant. However,
>>>   as Tejun pointed out [2], preemption-related considerations need to
>>>   be considered. Since css_rstat_updated() can be called from BPF where
>>>   preemption is not disabled by its framework and it has already been
>>>   exposed as a kfunc to BPF programs, introducing a new kfunc like bpf_xx
>>>   will break existing uses. Thus, we directly make css_rstat_updated()
>>>   preempt-safe here.
>>>
>> My understand of Tejun's comment is to add bpf_preempt_disable() and bpf_preempt_enable() calls around the css_rstat_updated() call in the bpf program defined in tools/testing/selftests/bpf/prog_tests/cgroup_hierarchical_stats.c instead of adding that in the css_rstat_updated() function itself. But I may be wrong.
>>
>> Cheers, Longman
>>
> If that's really the case, then I'd rather add a new wrapper kfunc for BPF
> to replace css_rstat_updated(). Otherwise, whether it gets triggered would
> depend entirely on users behavior.
>
> Right now, this WARNING is showing up in all BPF selftests. Although it's not
> treated as an error that fails the tests,it's visible in the action runs:
> https://github.com/kernel-patches/bpf/actions

All the existing callers of css_rstat_updated() except the bpf selftest 
has preemption disabled. So it doesn't make sense to impose a cost 
(though small) on kernel code that are in production kernel in order to 
make a selftest pass with no change.

Cheers,
Longman

>
Re: [PATCH v2] cgroup: Add preemption protection to css_rstat_updated()
Posted by Shakeel Butt 1 week ago
On Fri, Nov 21, 2025 at 10:47:47AM -0500, Waiman Long wrote:
> 
> On 11/21/25 1:21 AM, Jiayuan Chen wrote:
> > November 21, 2025 at 13:07, "Waiman Long" <llong@redhat.com mailto:llong@redhat.com?to=%22Waiman%20Long%22%20%3Cllong%40redhat.com%3E > wrote:
> > 
> > 
> > > On 11/20/25 11:06 PM, Jiayuan Chen wrote:
> > > 
> > > > BPF programs do not disable preemption, they only disable migration.
> > > >   Therefore, when running the cgroup_hierarchical_stats selftest, a
> > > >   warning [1] is generated.
> > > > 
> > > >   The css_rstat_updated() function is lockless and reentrant. However,
> > > >   as Tejun pointed out [2], preemption-related considerations need to
> > > >   be considered. Since css_rstat_updated() can be called from BPF where
> > > >   preemption is not disabled by its framework and it has already been
> > > >   exposed as a kfunc to BPF programs, introducing a new kfunc like bpf_xx
> > > >   will break existing uses. Thus, we directly make css_rstat_updated()
> > > >   preempt-safe here.
> > > > 
> > > My understand of Tejun's comment is to add bpf_preempt_disable() and bpf_preempt_enable() calls around the css_rstat_updated() call in the bpf program defined in tools/testing/selftests/bpf/prog_tests/cgroup_hierarchical_stats.c instead of adding that in the css_rstat_updated() function itself. But I may be wrong.
> > > 
> > > Cheers, Longman
> > > 
> > If that's really the case, then I'd rather add a new wrapper kfunc for BPF
> > to replace css_rstat_updated(). Otherwise, whether it gets triggered would
> > depend entirely on users behavior.
> > 
> > Right now, this WARNING is showing up in all BPF selftests. Although it's not
> > treated as an error that fails the tests,it's visible in the action runs:
> > https://github.com/kernel-patches/bpf/actions
> 
> All the existing callers of css_rstat_updated() except the bpf selftest has
> preemption disabled. So it doesn't make sense to impose a cost (though
> small) on kernel code that are in production kernel in order to make a
> selftest pass with no change.

+1. Please fix the bpf selftests to disable preemption before calling
this function.

For the long term, why not make the bpf verifier fails the bpf programs
if they don't disable preemption before calling such functions?