kernel/cgroup/rstat.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
Cuurently the rstat update side is lockless and transfers the css of
cgroup whose stats has been updated through lockless list (llist). There
is an expected race where rstat updater skips adding css to the llist
because it was already in the list but the flusher might not see those
updates done by the skipped updater.
Usually the subsequent updater will take care of such situation but what
if the skipped updater was the last updater before the cgroup is removed
by the user. In that case stat updates by the skipped updater will be
lost. To avoid that let's always flush the stats of the offlined cgroup.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Fixes: 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe")
---
kernel/cgroup/rstat.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index a198e40c799b..91b34ebd5370 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -283,6 +283,16 @@ static struct cgroup_subsys_state *css_rstat_updated_list(
css_process_update_tree(root->ss, cpu);
+ /*
+ * We allow race between rstat updater and flusher which can cause a
+ * scenario where the updater skips adding the css to the list but the
+ * flusher might not see updater's updates. Usually the subsequent
+ * updater would take care of that but what if that was the last updater
+ * on that CPU before getting removed. Handle that scenario here.
+ */
+ if (!css_is_online(root))
+ __css_process_update_tree(root, cpu);
+
/* Return NULL if this subtree is not on-list */
if (!rstatc->updated_next)
return NULL;
--
2.47.3
Hi Shakeel.
On Thu, Dec 04, 2025 at 01:06:00PM -0800, Shakeel Butt <shakeel.butt@linux.dev> wrote:
> Cuurently the rstat update side is lockless and transfers the css of
> cgroup whose stats has been updated through lockless list (llist). There
> is an expected race where rstat updater skips adding css to the llist
> because it was already in the list but the flusher might not see those
> updates done by the skipped updater.
Notice that there's css_rstat_flush() in
css_free_rwork_fn()/css_rstat_exit().
> Usually the subsequent updater will take care of such situation but what
> if the skipped updater was the last updater before the cgroup is removed
> by the user. In that case stat updates by the skipped updater will be
> lost. To avoid that let's always flush the stats of the offlined cgroup.
Are you sure here that this is the different cause of the loss than the
other with unlocked cmpxchg you posted later?
> @@ -283,6 +283,16 @@ static struct cgroup_subsys_state *css_rstat_updated_list(
>
> css_process_update_tree(root->ss, cpu);
>
> + /*
> + * We allow race between rstat updater and flusher which can cause a
> + * scenario where the updater skips adding the css to the list but the
> + * flusher might not see updater's updates. Usually the subsequent
> + * updater would take care of that but what if that was the last updater
> + * on that CPU before getting removed. Handle that scenario here.
> + */
> + if (!css_is_online(root))
> + __css_process_update_tree(root, cpu);
> +
I'm thinking about this approach:
@@ -482,6 +484,15 @@ void css_rstat_exit(struct cgroup_subsys_state *css)
if (!css->rstat_cpu)
return;
+ /*
+ * We allow race between rstat updater and flusher which can cause a
+ * scenario where the updater skips adding the css to the list but the
+ * flusher might not see updater's updates. Usually the subsequent
+ * updater would take care of that but what if that was the last updater
+ * on that CPU before getting removed. Handle that scenario here.
+ */
+ for_each_possible_cpu(cpu)
+ css_rstat_updated(css, cpu);
css_rstat_flush(css);
/* sanity check */
because that moves the special treating from relatively commonn
css_rstat_updated_list() to only cgroup_exit().
(I didn't check this wouldn't break anything.)
Thanks,
Michal
On Mon, Dec 08, 2025 at 07:35:30PM +0100, Michal Koutný wrote: > Hi Shakeel. > > On Thu, Dec 04, 2025 at 01:06:00PM -0800, Shakeel Butt <shakeel.butt@linux.dev> wrote: > > Cuurently the rstat update side is lockless and transfers the css of > > cgroup whose stats has been updated through lockless list (llist). There > > is an expected race where rstat updater skips adding css to the llist > > because it was already in the list but the flusher might not see those > > updates done by the skipped updater. > > Notice that there's css_rstat_flush() in > css_free_rwork_fn()/css_rstat_exit(). > > > Usually the subsequent updater will take care of such situation but what > > if the skipped updater was the last updater before the cgroup is removed > > by the user. In that case stat updates by the skipped updater will be > > lost. To avoid that let's always flush the stats of the offlined cgroup. > > Are you sure here that this is the different cause of the loss than the > other with unlocked cmpxchg you posted later? > I didn't see any stats loss due to this specific case but I found this on code inspection while debugging the other issue. > > @@ -283,6 +283,16 @@ static struct cgroup_subsys_state *css_rstat_updated_list( > > > > css_process_update_tree(root->ss, cpu); > > > > + /* > > + * We allow race between rstat updater and flusher which can cause a > > + * scenario where the updater skips adding the css to the list but the > > + * flusher might not see updater's updates. Usually the subsequent > > + * updater would take care of that but what if that was the last updater > > + * on that CPU before getting removed. Handle that scenario here. > > + */ > > + if (!css_is_online(root)) > > + __css_process_update_tree(root, cpu); > > + > > I'm thinking about this approach: > > @@ -482,6 +484,15 @@ void css_rstat_exit(struct cgroup_subsys_state *css) > if (!css->rstat_cpu) > return; > > + /* > + * We allow race between rstat updater and flusher which can cause a > + * scenario where the updater skips adding the css to the list but the > + * flusher might not see updater's updates. Usually the subsequent > + * updater would take care of that but what if that was the last updater > + * on that CPU before getting removed. Handle that scenario here. > + */ > + for_each_possible_cpu(cpu) > + css_rstat_updated(css, cpu); > css_rstat_flush(css); > > /* sanity check */ > > because that moves the special treating from relatively commonn > css_rstat_updated_list() to only cgroup_exit(). > > (I didn't check this wouldn't break anything.) Yes I think this is much better. We just need to disable preemption for the assert within css_rstat_updated().
© 2016 - 2025 Red Hat, Inc.