From: Namhyung Kim
To: Peter Zijlstra, Ingo Molnar
Cc: Mark Rutland, Alexander Shishkin, Arnaldo Carvalho de Melo, LKML,
    Rosalie Fang
Subject: [PATCH v2] perf/core: Fix slow perf_event_task_exit() with LBR callstacks
Date: Wed, 14 Jan 2026 10:01:30 -0800
Message-ID: <20260114180130.133766-1-namhyung@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

I got a report that a task is stuck in perf_event_exit_task() waiting
for global_ctx_data_rwsem.  On large systems with lots of threads,
perf_event_open() can hold the lock for a long time, because it takes it
for writing and then iterates over all threads in the system to allocate
the context data.  That blocks the task exit path, which is especially
problematic under memory pressure.

  perf_event_open
    perf_event_alloc
      attach_perf_ctx_data
        attach_global_ctx_data
          percpu_down_write (global_ctx_data_rwsem)
            for_each_process_thread
              alloc_task_ctx_data
                                               do_exit
                                                 perf_event_exit_task
                                                   percpu_down_read (global_ctx_data_rwsem)

The exit path should not take global_ctx_data_rwsem.  Let's skip
allocation for exiting tasks and free the data carefully.

Reported-by: Rosalie Fang
Suggested-by: Peter Zijlstra
Signed-off-by: Namhyung Kim
---
 kernel/events/core.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 376fb07d869b8b50..b164e884102323f5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5424,6 +5424,17 @@ attach_task_ctx_data(struct task_struct *task, struct kmem_cache *ctx_cache,
 	if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) {
 		if (old)
 			perf_free_ctx_data_rcu(old);
+		/*
+		 * Above try_cmpxchg() pairs with try_cmpxchg() from
+		 * detach_task_ctx_data() such that
+		 * if we race with perf_event_exit_task(), we must
+		 * observe PF_EXITING.
+		 */
+		if (task->flags & PF_EXITING) {
+			/* detach_task_ctx_data() may free it already */
+			if (try_cmpxchg(&task->perf_ctx_data, &cd, NULL))
+				perf_free_ctx_data_rcu(cd);
+		}
 		return 0;
 	}
 
@@ -5469,6 +5480,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
 	/* Allocate everything */
 	scoped_guard (rcu) {
 		for_each_process_thread(g, p) {
+			if (p->flags & PF_EXITING)
+				continue;
 			cd = rcu_dereference(p->perf_ctx_data);
 			if (cd && !cd->global) {
 				cd->global = 1;
@@ -14562,8 +14575,11 @@ void perf_event_exit_task(struct task_struct *task)
 
 	/*
 	 * Detach the perf_ctx_data for the system-wide event.
+	 *
+	 * Done without holding global_ctx_data_rwsem; typically
+	 * attach_global_ctx_data() will skip over this task, but otherwise
+	 * attach_task_ctx_data() will observe PF_EXITING.
 	 */
-	guard(percpu_read)(&global_ctx_data_rwsem);
 	detach_task_ctx_data(task);
 }
 
-- 
2.52.0.457.g6b5491de43-goog
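
Illustration only, not part of the patch: the handshake between
attach_task_ctx_data() and the exit path can be modelled in userspace.
The sketch below assumes C11 atomics in place of the kernel's
try_cmpxchg() and a plain free() in place of perf_free_ctx_data_rcu().
Every sketch_* name, struct sketch_task and SKETCH_PF_EXITING are
hypothetical.  The exit side uses atomic_exchange() as a simplification,
since the body of detach_task_ctx_data() is not shown in this patch, and
it assumes PF_EXITING is set before the detach, as the new comments
imply.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's PF_EXITING task flag. */
#define SKETCH_PF_EXITING 0x4u

struct sketch_ctx_data {
	int global;
};

/* Hypothetical, minimal stand-in for struct task_struct. */
struct sketch_task {
	_Atomic unsigned int flags;
	_Atomic(struct sketch_ctx_data *) perf_ctx_data;
};

/* Counts frees so a caller can check each object is freed exactly once. */
static int sketch_nr_freed;

static void sketch_task_init(struct sketch_task *t)
{
	atomic_init(&t->flags, 0u);
	atomic_init(&t->perf_ctx_data, (struct sketch_ctx_data *)NULL);
}

/* Stand-in for perf_free_ctx_data_rcu(); no RCU grace period here. */
static void sketch_free_ctx_data(struct sketch_ctx_data *cd)
{
	free(cd);
	sketch_nr_freed++;
}

/*
 * Attach side: publish the data first, then check the exiting flag.
 * If the task is exiting, try to take the data back.  The second
 * compare-and-swap fails if the exit side already removed and freed it.
 */
static void sketch_attach(struct sketch_task *t)
{
	struct sketch_ctx_data *cd = calloc(1, sizeof(*cd));
	struct sketch_ctx_data *old = NULL;

	if (!cd)
		return;

	/* Simplification: only an empty slot is handled, no retry loop. */
	if (!atomic_compare_exchange_strong(&t->perf_ctx_data, &old, cd)) {
		sketch_free_ctx_data(cd);
		return;
	}

	if (atomic_load(&t->flags) & SKETCH_PF_EXITING) {
		struct sketch_ctx_data *expected = cd;

		if (atomic_compare_exchange_strong(&t->perf_ctx_data, &expected,
						   (struct sketch_ctx_data *)NULL))
			sketch_free_ctx_data(cd);
	}
}

/* Exit side: set the flag first, then detach without any global lock. */
static void sketch_exit_task(struct sketch_task *t)
{
	struct sketch_ctx_data *cd;

	atomic_fetch_or(&t->flags, SKETCH_PF_EXITING);
	cd = atomic_exchange(&t->perf_ctx_data, (struct sketch_ctx_data *)NULL);
	if (cd)
		sketch_free_ctx_data(cd);
}
```

With sequentially consistent atomics, either the exit side sees the
published pointer and frees it, or the attach side sees the flag and
takes the pointer back.  In both orders the slot ends up NULL and the
object is freed once, which is the property the patch relies on after
dropping the rwsem from the exit path.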