[PATCH v2 1/2] perf/core: Fix nr_cgroups/cpuctx->cgrp is not updated correctly

Luo Gengkun posted 2 patches 6 months, 2 weeks ago
[PATCH v2 1/2] perf/core: Fix nr_cgroups/cpuctx->cgrp is not updated correctly
Posted by Luo Gengkun 6 months, 2 weeks ago
__perf_remove_from_context() updates event->state before it calls
list_del_event(). By the time list_del_event() checks
event->state > PERF_EVENT_STATE_OFF, the state is already OFF or lower,
so perf_cgroup_event_disable() is never called and neither nr_cgroups
nor cpuctx->cgrp is updated.

To fix this problem, move the perf_cgroup_event_disable() call into
__perf_remove_from_context(), ahead of the state update, as was done in:

  commit a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting bug at task exit")

Fixes: a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting bug at task exit")
Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
---
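Note for reviewers: the ordering problem described above can be reproduced
with a small userspace model. This is only a sketch under stated
assumptions: every toy_* name is hypothetical, the state values are picked
just to preserve the ordering DEAD < ERROR < OFF < INACTIVE < ACTIVE, and
toy_cgroup_event_disable() is a simplified stand-in for
perf_cgroup_event_disable(), not the kernel implementation.

```c
#include <stdbool.h>

/* Toy stand-ins for the perf event states. The numeric values are
 * assumptions for illustration; only their relative order matters. */
enum toy_state {
	TOY_STATE_DEAD     = -4,
	TOY_STATE_ERROR    = -2,
	TOY_STATE_OFF      = -1,
	TOY_STATE_INACTIVE =  0,
	TOY_STATE_ACTIVE   =  1,
};

struct toy_ctx {
	int  nr_cgroups;	/* models ctx->nr_cgroups */
	bool cgrp_set;		/* models cpuctx->cgrp being non-NULL */
};

struct toy_event {
	enum toy_state state;
};

/* Simplified model of the cgroup bookkeeping: drop one cgroup event and
 * clear the cgrp marker once the last one is gone. */
static void toy_cgroup_event_disable(struct toy_ctx *ctx)
{
	if (--ctx->nr_cgroups == 0)
		ctx->cgrp_set = false;
}

static enum toy_state toy_min_state(enum toy_state a, enum toy_state b)
{
	return a < b ? a : b;
}

/* Ordering before this patch: the state is lowered first, then the
 * list_del_event()-style check runs. With target <= TOY_STATE_OFF the
 * check can never be true, so the bookkeeping is skipped. */
static void toy_remove_before_patch(struct toy_event *event,
				    struct toy_ctx *ctx,
				    enum toy_state target)
{
	event->state = toy_min_state(event->state, target);

	if (event->state > TOY_STATE_OFF) {
		toy_cgroup_event_disable(ctx);
		event->state = TOY_STATE_OFF;
	}
}

/* Ordering after this patch: check the old state first, then lower it. */
static void toy_remove_after_patch(struct toy_event *event,
				   struct toy_ctx *ctx,
				   enum toy_state target)
{
	if (event->state > TOY_STATE_OFF)
		toy_cgroup_event_disable(ctx);

	event->state = toy_min_state(event->state, target);
}
```

In this model, removing an ACTIVE event with target TOY_STATE_OFF leaves
nr_cgroups untouched with the old ordering and decrements it with the new
one. An event already in TOY_STATE_ERROR keeps its state in both variants,
because min() never raises it.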
 kernel/events/core.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f34c99f8ce8f..280d42b40b34 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2120,18 +2120,6 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 	if (event->group_leader == event)
 		del_event_from_groups(event, ctx);
 
-	/*
-	 * If event was in error state, then keep it
-	 * that way, otherwise bogus counts will be
-	 * returned on read(). The only way to get out
-	 * of error state is by explicit re-enabling
-	 * of the event
-	 */
-	if (event->state > PERF_EVENT_STATE_OFF) {
-		perf_cgroup_event_disable(event, ctx);
-		perf_event_set_state(event, PERF_EVENT_STATE_OFF);
-	}
-
 	ctx->generation++;
 	event->pmu_ctx->nr_events--;
 }
@@ -2498,6 +2486,9 @@ __perf_remove_from_context(struct perf_event *event,
 		state = PERF_EVENT_STATE_DEAD;
 	}
 	event_sched_out(event, ctx);
+
+	if (event->state > PERF_EVENT_STATE_OFF)
+		perf_cgroup_event_disable(event, ctx);
 	perf_event_set_state(event, min(event->state, state));
 
 	if (flags & DETACH_GROUP)
-- 
2.34.1
Re: [PATCH v2 1/2] perf/core: Fix nr_cgroups/cpuctx->cgrp is not updated correctly
Posted by Peter Zijlstra 6 months, 2 weeks ago
On Wed, Jun 04, 2025 at 03:39:23AM +0000, Luo Gengkun wrote:
> Since __perf_remove_from_context updates event->state before
> list_del_event, this prevents list_del_event from calling
> perf_cgroup_event_disable, resulting in will not update nr_cgroups and
> cpuctx->cgrp.
> 
> To fix this problem, move perf_cgroup_event_disable into
> __perf_remove_from_context as:
> 
> commit a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting bug at task exit")
> 
> did.
> 
> Fixes: a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting bug at task exit")
> Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>

So I already have Yeoreum's patch for this:

  https://lkml.kernel.org/r/20250603144414.GC38114@noisy.programming.kicks-ass.net
Re: [PATCH v2 1/2] perf/core: Fix nr_cgroups/cpuctx->cgrp is not updated correctly
Posted by Luo Gengkun 6 months, 2 weeks ago
On 2025/6/4 17:19, Peter Zijlstra wrote:
> On Wed, Jun 04, 2025 at 03:39:23AM +0000, Luo Gengkun wrote:
>> Since __perf_remove_from_context updates event->state before
>> list_del_event, this prevents list_del_event from calling
>> perf_cgroup_event_disable, resulting in will not update nr_cgroups and
>> cpuctx->cgrp.
>>
>> To fix this problem, move perf_cgroup_event_disable into
>> __perf_remove_from_context as:
>>
>> commit a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting bug at task exit")
>>
>> did.
>>
>> Fixes: a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting bug at task exit")
>> Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
>
> So I already have Yeoreum's patch for this:
>
>    https://lkml.kernel.org/r/20250603144414.GC38114@noisy.programming.kicks-ass.net

Oh, okay. But the second patch can still be reviewed. Thanks.