From nobody Fri Dec 19 16:44:56 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B722917C99B; Wed, 4 Sep 2024 23:43:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493406; cv=none; b=cfmBYE3W2xDfkBMX4JyXOzc552s95YPyHhq1lwudnSYfIFBHl6eybgkumKNLur+lYyz4GdtGJ3kZiu2m5n1ow+T5Tb9EHXjzXHtBpmCHyPSE2sIkyU0Ygjn8TFZDR9qYrkoXjz2k3QQZTN1S7Nqf5uIPVmP/CYK+2XsbAxCZ2zE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493406; c=relaxed/simple; bh=+WPVg+WaOEDJGm5/PXSMmD8jN2HQFBqGoZ67+9/Ors4=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=FWLGNZOI9AB855xIf0N0UQ+r6TE/XFatpQJDRtCqzwNcwuI+AuBtiDHswYdD050nDPT7W+2rYaPqdODxlTIlCbPDNPtAKpfwQWwZZvrgwOiOm6tH3tC0IT2Q6g/fOm2X+lRFoLVANyfLbE7667m+bUZOPj/XQpei/fqmZ+uEVFc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4B27DC4CEC8; Wed, 4 Sep 2024 23:43:26 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98) (envelope-from ) id 1slzfn-00000005BnR-39rT; Wed, 04 Sep 2024 19:44:27 -0400 Message-ID: <20240904234427.612375392@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 04 Sep 2024 19:44:12 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Alexei Starovoitov , Florent Revest , Martin KaFai Lau , bpf , Sven Schnelle , Alexei Starovoitov , Jiri Olsa , Arnaldo Carvalho de Melo , Daniel Borkmann , Alan Maguire , Peter Zijlstra , Thomas Gleixner , Guo Ren Subject: [for-linus][PATCH 1/6] tracing: fgraph: Fix to add new fgraph_ops to array after 
ftrace_startup_subops()
References: <20240904234411.443593140@goodmis.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: "Masami Hiramatsu (Google)"

Since register_ftrace_graph() assigns a new fgraph_ops to fgraph_array
before registering it with ftrace_startup_subops(), the new fgraph_ops can
be used in function_graph_enter().

In most cases this is still OK, because the fgraph_ops's hash table has
already been initialized by ftrace_set_filter*() etc.

But if a user registers a new fgraph_ops that does not initialize the
hash list, ftrace_ops_test() in function_graph_enter() causes a NULL
pointer dereference BUG because fgraph_ops->ops.func_hash is NULL.

This can be reproduced with the commands below, because the function
profiler's fgraph_ops does not initialize the hash list:

 # cd /sys/kernel/tracing
 # echo function_graph > current_tracer
 # echo 1 > function_profile_enabled

To fix this problem, add the new fgraph_ops to fgraph_array only after
ftrace_startup_subops() has succeeded. Thus, until the new fgraph_ops is
initialized, we will see fgraph_stub on the corresponding fgraph_array
entry.
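
For illustration only (not part of this patch), the ordering can be
sketched in user-space C. Every demo_* name below is a hypothetical
stand-in, not a kernel API, and the sketch is single-threaded, so it shows
only the "publish after startup" ordering and not the real concurrency:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the kernel's types. */
struct demo_hash {
	int nr_entries;
};

struct demo_ops {
	struct demo_hash *func_hash;	/* stays NULL until demo_startup() succeeds */
	int (*entryfunc)(struct demo_ops *ops);
};

#define DEMO_SLOTS 4

int demo_stub_entry(struct demo_ops *ops)
{
	(void)ops;
	return 0;
}

int demo_count_entry(struct demo_ops *ops)
{
	(void)ops;
	return 1;
}

/* Every slot starts out pointing at the stub, as fgraph_array does. */
struct demo_ops demo_stub = { NULL, demo_stub_entry };
struct demo_ops *demo_array[DEMO_SLOTS] = {
	&demo_stub, &demo_stub, &demo_stub, &demo_stub
};
struct demo_hash demo_empty_hash;

/* Models the startup step that gives the ops a valid hash; it can fail. */
int demo_startup(struct demo_ops *ops, int fail)
{
	if (fail)
		return -1;
	ops->func_hash = &demo_empty_hash;
	return 0;
}

/* Publish into the array only after startup succeeded. */
int demo_register(struct demo_ops *ops, int idx, int fail)
{
	int ret = demo_startup(ops, fail);

	if (!ret)
		demo_array[idx] = ops;
	return ret;
}

/* Reader side: walks the array and uses the hash of every non-stub entry. */
int demo_enter(void)
{
	int i, called = 0;

	for (i = 0; i < DEMO_SLOTS; i++) {
		struct demo_ops *ops = demo_array[i];

		if (ops == &demo_stub)
			continue;
		/* The real code dereferences the hash here; NULL would crash. */
		assert(ops->func_hash != NULL);
		called += ops->entryfunc(ops);
	}
	return called;
}
```

With this ordering, a failed demo_register() leaves the slot pointing at
demo_stub, so demo_enter() never sees an entry whose hash is still NULL.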
Cc: Alexei Starovoitov
Cc: Florent Revest
Cc: Martin KaFai Lau
Cc: bpf
Cc: Sven Schnelle
Cc: Alexei Starovoitov
Cc: Jiri Olsa
Cc: Arnaldo Carvalho de Melo
Cc: Daniel Borkmann
Cc: Alan Maguire
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Guo Ren
Link: https://lore.kernel.org/172398528350.293426.8347220120333730248.stgit@devnote2
Fixes: c132be2c4fcc ("function_graph: Have the instances use their own ftrace_ops for filtering")
Signed-off-by: Masami Hiramatsu (Google)
Signed-off-by: Steven Rostedt (Google)
---
 kernel/trace/fgraph.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index d1d5ea2d0a1b..d7d4fb403f6f 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -1206,18 +1206,24 @@ static void init_task_vars(int idx)
 	read_unlock(&tasklist_lock);
 }
 
-static void ftrace_graph_enable_direct(bool enable_branch)
+static void ftrace_graph_enable_direct(bool enable_branch, struct fgraph_ops *gops)
 {
 	trace_func_graph_ent_t func = NULL;
 	trace_func_graph_ret_t retfunc = NULL;
 	int i;
 
-	for_each_set_bit(i, &fgraph_array_bitmask,
-			 sizeof(fgraph_array_bitmask) * BITS_PER_BYTE) {
-		func = fgraph_array[i]->entryfunc;
-		retfunc = fgraph_array[i]->retfunc;
-		fgraph_direct_gops = fgraph_array[i];
-	 }
+	if (gops) {
+		func = gops->entryfunc;
+		retfunc = gops->retfunc;
+		fgraph_direct_gops = gops;
+	} else {
+		for_each_set_bit(i, &fgraph_array_bitmask,
+				 sizeof(fgraph_array_bitmask) * BITS_PER_BYTE) {
+			func = fgraph_array[i]->entryfunc;
+			retfunc = fgraph_array[i]->retfunc;
+			fgraph_direct_gops = fgraph_array[i];
+		}
+	}
 	if (WARN_ON_ONCE(!func))
 		return;
 
@@ -1256,8 +1262,6 @@ int register_ftrace_graph(struct fgraph_ops *gops)
 		ret = -ENOSPC;
 		goto out;
 	}
-
-	fgraph_array[i] = gops;
 	gops->idx = i;
 
 	ftrace_graph_active++;
@@ -1266,7 +1270,7 @@ int register_ftrace_graph(struct fgraph_ops *gops)
 		ftrace_graph_disable_direct(true);
 
 	if (ftrace_graph_active == 1) {
-		ftrace_graph_enable_direct(false);
+		ftrace_graph_enable_direct(false, gops);
 		register_pm_notifier(&ftrace_suspend_notifier);
 		ret = start_graph_tracing();
 		if (ret)
@@ -1281,14 +1285,15 @@ int register_ftrace_graph(struct fgraph_ops *gops)
 	} else {
 		init_task_vars(gops->idx);
 	}
-
 	/* Always save the function, and reset at unregistering */
 	gops->saved_func = gops->entryfunc;
 
 	ret = ftrace_startup_subops(&graph_ops, &gops->ops, command);
+	if (!ret)
+		fgraph_array[i] = gops;
+
 error:
 	if (ret) {
-		fgraph_array[i] = &fgraph_stub;
 		ftrace_graph_active--;
 		gops->saved_func = NULL;
 		fgraph_lru_release_index(i);
@@ -1324,7 +1329,7 @@ void unregister_ftrace_graph(struct fgraph_ops *gops)
 	ftrace_shutdown_subops(&graph_ops, &gops->ops, command);
 
 	if (ftrace_graph_active == 1)
-		ftrace_graph_enable_direct(true);
+		ftrace_graph_enable_direct(true, NULL);
 	else if (!ftrace_graph_active)
 		ftrace_graph_disable_direct(false);
 
-- 
2.43.0

From nobody Fri Dec 19 16:44:56 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B71BF17BEA7; Wed, 4 Sep 2024 23:43:26 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493406; cv=none; b=oFM9SWd6nLgbcMsQsKuaVL8z1+DlNtda4rv4fSUg3vR1KGExe/7hxhPHQ29a7Aos1zMWzNXaiG/lowznW3eYivg57108buOuu66xmhagp2gYz2vlA+0ihoUQY0NCXXCG68D7yFouljeExeU4N1glGdtOe6d55fKF28OXdCrDDyk=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493406; c=relaxed/simple; bh=I8pfGr9tNLSa03+h5g9vvW/v1zZshm0EIxFsxc/oTFQ=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type;
b=TXmkvpob2qcF+Bx67moJTs7mBQsu65XPyke9Kvbp5b9cl0MojsdFIPoyuw7NtSiXMbUPbmzfhCvI+h2Ddy4Cij5DNsyEPWXnbeE3+et4srzxgzgrU7t7sLziz3VJfyMErNmthjHTdmHzkB0FBX8VT8xIKZqPXFs6XjEUNxTVIXk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4D4ABC4CEC9; Wed, 4 Sep 2024 23:43:26 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98) (envelope-from ) id 1slzfn-00000005Bnv-3nbT; Wed, 04 Sep 2024 19:44:27 -0400 Message-ID: <20240904234427.774268849@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 04 Sep 2024 19:44:13 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , stable@vger.kernel.org Subject: [for-linus][PATCH 2/6] tracing: Fix memory leak in fgraph storage selftest References: <20240904234411.443593140@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: "Masami Hiramatsu (Google)" With ftrace boot-time selftest, kmemleak reported some memory leaks in the new test case for function graph storage for multiple tracers. unreferenced object 0xffff888005060080 (size 32): comm "swapper/0", pid 1, jiffies 4294676440 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 20 10 06 05 80 88 ff ff ........ ....... 54 0c 1e 81 ff ff ff ff 00 00 00 00 00 00 00 00 T............... 
  backtrace (crc 7c93416c):
    [<000000000238ee6f>] __kmalloc_cache_noprof+0x11f/0x2a0
    [<0000000033d2b6c5>] enter_record+0xe8/0x150
    [<0000000054c38424>] match_records+0x1cd/0x230
    [<00000000c775b63d>] ftrace_set_hash+0xff/0x380
    [<000000007bf7208c>] ftrace_set_filter+0x70/0x90
    [<00000000a5c08dda>] test_graph_storage_multi+0x2e/0xf0
    [<000000006ba028ca>] trace_selftest_startup_function_graph+0x1e8/0x260
    [<00000000a715d3eb>] run_tracer_selftest+0x111/0x190
    [<00000000395cbf90>] register_tracer+0xdf/0x1f0
    [<0000000093e67f7b>] do_one_initcall+0x141/0x3b0
    [<00000000c591b682>] do_initcall_level+0x82/0xa0
    [<000000004e4c6600>] do_initcalls+0x43/0x70
    [<0000000034f3c4e4>] kernel_init_freeable+0x170/0x1f0
    [<00000000c7a5dab2>] kernel_init+0x1a/0x1a0
    [<00000000ea105947>] ret_from_fork+0x3a/0x50
    [<00000000a1932e84>] ret_from_fork_asm+0x1a/0x30
 ...

This means the filter hashes allocated for the fixtures are not correctly
released after the test.

Free those hash lists after the tests are done, and split the loop into
one that initializes the fixtures and one that registers them, so that
each step can be rolled back separately.
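
For illustration only (not part of this patch), the two-loop setup with
separate rollback labels can be sketched in user-space C. All demo_*
names are hypothetical; malloc() stands in for the filter hash allocation
and a flag stands in for registration:

```c
#include <assert.h>
#include <stdlib.h>

#define NFIX 3

/* Hypothetical fixture: one allocation plus a "registered" flag. */
struct demo_fixture {
	void *filter;		/* models the filter hash set up per fixture */
	int registered;
};

int demo_live_allocs;		/* number of filters currently allocated */

int demo_set_filter(struct demo_fixture *f)
{
	assert(f->filter == NULL);	/* catch a double allocation */
	f->filter = malloc(32);
	if (!f->filter)
		return -1;
	demo_live_allocs++;
	return 0;
}

void demo_free_filter(struct demo_fixture *f)
{
	if (f->filter) {
		free(f->filter);
		f->filter = NULL;
		demo_live_allocs--;
	}
}

int demo_fix_register(struct demo_fixture *f, int fail)
{
	if (fail)
		return -1;
	f->registered = 1;
	return 0;
}

void demo_fix_unregister(struct demo_fixture *f)
{
	f->registered = 0;
}

/* fail_register_at < 0 means that no registration fails. */
int demo_run(struct demo_fixture *fix, int fail_register_at)
{
	int i, j = 0, ret = 0;

	/* Phase 1: allocate a filter for every fixture. */
	for (i = 0; i < NFIX; i++) {
		ret = demo_set_filter(&fix[i]);
		if (ret)
			goto out2;
	}

	/* Phase 2: register every fixture. */
	for (j = 0; j < NFIX; j++) {
		ret = demo_fix_register(&fix[j], j == fail_register_at);
		if (ret)
			goto out1;
	}

	/* The test body would run here. */
out1:
	/* Undo only the registrations that succeeded. */
	while (--j >= 0)
		demo_fix_unregister(&fix[j]);
out2:
	/* Undo only the allocations that succeeded. */
	while (--i >= 0)
		demo_free_filter(&fix[i]);
	return ret;
}
```

Because i and j each count how far their own loop got, every exit path
releases exactly what was set up, whether the run succeeds or fails
partway through.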
Fixes: dd120af2d5f8 ("ftrace: Add multiple fgraph storage selftest")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/172411539857.28895.13119957560263401102.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google)
Signed-off-by: Steven Rostedt (Google)
---
 kernel/trace/trace_selftest.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 97f1e4bc47dc..c4ad7cd7e778 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -942,7 +942,7 @@ static __init int test_graph_storage_multi(void)
 {
 	struct fgraph_fixture *fixture;
 	bool printed = false;
-	int i, ret;
+	int i, j, ret;
 
 	pr_cont("PASSED\n");
 	pr_info("Testing multiple fgraph storage on a function: ");
@@ -953,22 +953,35 @@ static __init int test_graph_storage_multi(void)
 		if (ret && ret != -ENODEV) {
 			pr_cont("*Could not set filter* ");
 			printed = true;
-			goto out;
+			goto out2;
 		}
+	}
 
+	for (j = 0; j < ARRAY_SIZE(store_bytes); j++) {
+		fixture = &store_bytes[j];
 		ret = register_ftrace_graph(&fixture->gops);
 		if (ret) {
 			pr_warn("Failed to init store_bytes fgraph tracing\n");
 			printed = true;
-			goto out;
+			goto out1;
 		}
 	}
 
 	DYN_FTRACE_TEST_NAME();
-out:
+out1:
+	while (--j >= 0) {
+		fixture = &store_bytes[j];
+		unregister_ftrace_graph(&fixture->gops);
+
+		if (fixture->error_str && !printed) {
+			pr_cont("*** %s ***", fixture->error_str);
+			printed = true;
+		}
+	}
+out2:
 	while (--i >= 0) {
 		fixture = &store_bytes[i];
-		unregister_ftrace_graph(&fixture->gops);
+		ftrace_free_filter(&fixture->gops.ops);
 
 		if (fixture->error_str && !printed) {
 			pr_cont("*** %s ***", fixture->error_str);
-- 
2.43.0

From nobody Fri Dec 19 16:44:56 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org
(Postfix) with ESMTPS id B7126179958; Wed, 4 Sep 2024 23:43:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493406; cv=none; b=OtHwDFHRL58S1dIcVJleFLFKVY1BvzRrmhTk033Rt/wfraeV8Dov+/TjtYbqZr3+BGLiq5adeJGL8d1ffI5yB2wtm6BC+vJHGq1NGqU62bIGRZmhCBnW1q7n0DyWaE6DLxSvIFo8Gh63gcMjy4EbBh/2o+SuUZO5Jetr8Ut9Rt0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493406; c=relaxed/simple; bh=z1Hmdig6/sKfwwjYdO9D6QFTtvoQAzQOiSHpFgUeDn8=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=bUmXP+Hv68rRFqargBL1yu5t9e5OkiKZv7uvhx7tcvCOJgD/NnRAO20RMyQo6DpKVfcAmjrR3GYiM9MUXPQaLajk76izzZb2/n3A1oPoQdXjMVkgRJHFg/g6qGeGZlHmpVUTWIdArLL4LBHzX9OZpA/97GGScrvPC8l2nRlH2yg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 51857C4CEC3; Wed, 4 Sep 2024 23:43:26 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98) (envelope-from ) id 1slzfo-00000005BoO-0GyU; Wed, 04 Sep 2024 19:44:28 -0400 Message-ID: <20240904234427.927438809@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 04 Sep 2024 19:44:14 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , stable@vger.kernel.org, "Luis Claudio R. Goncalves" , Tomas Glozar Subject: [for-linus][PATCH 3/6] tracing/timerlat: Only clear timer if a kthread exists References: <20240904234411.443593140@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Steven Rostedt The timerlat tracer can use user space threads to check for osnoise and timer latency. 
If the program using this is killed via a SIGTERM, the threads are shut
down one at a time, and another tracing instance can start up and reset
the threads before they are fully closed. That causes the hrtimer
assigned to the kthread to be shut down and freed twice when the dying
thread finally closes the file descriptors, causing a use-after-free bug.

Only cancel the hrtimer if the associated thread is still around. Note,
this is just a quick fix that can be backported to stable. A real fix is
to have better synchronization between the shutdown of old threads and
the starting of new ones.

Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu
Cc: Mathieu Desnoyers
Cc: "Luis Claudio R. Goncalves"
Link: https://lore.kernel.org/20240903111642.35292e70@gandalf.local.home
Fixes: e88ed227f639e ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar
Signed-off-by: Steven Rostedt (Google)
---
 kernel/trace/trace_osnoise.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 66a871553d4a..400a72cd6ab5 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -265,6 +265,8 @@ static inline void tlat_var_reset(void)
 	 */
 	for_each_cpu(cpu, cpu_online_mask) {
 		tlat_var = per_cpu_ptr(&per_cpu_timerlat_var, cpu);
+		if (tlat_var->kthread)
+			hrtimer_cancel(&tlat_var->timer);
 		memset(tlat_var, 0, sizeof(*tlat_var));
 	}
 }
@@ -2579,7 +2581,8 @@ static int timerlat_fd_release(struct inode *inode, struct file *file)
 	osn_var = per_cpu_ptr(&per_cpu_osnoise_var, cpu);
 	tlat_var = per_cpu_ptr(&per_cpu_timerlat_var, cpu);
 
-	hrtimer_cancel(&tlat_var->timer);
+	if (tlat_var->kthread)
+		hrtimer_cancel(&tlat_var->timer);
 	memset(tlat_var, 0, sizeof(*tlat_var));
 
 	osn_var->sampling = 0;
-- 
2.43.0

From nobody Fri Dec 19 16:44:56 2025
Received: from smtp.kernel.org
(aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DF8A7179654; Wed, 4 Sep 2024 23:43:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493406; cv=none; b=Ew/tE8fIXyujgvajBxf4Dw43dEQwZvoqi5/NkiRUJus6Jxsi0pq4/bYzGIQgAx24WwYQJbF4OM6NTzgV2EC8DMa5+d4fndP+Fy2vQ99FrggajHjFDKZQNjn2+h36vIJ8vdsrxDEcLSDcQ5OVgPeIhjsIqmGURfrx9O7TGj+CBsA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493406; c=relaxed/simple; bh=PQwa/d3gkR58fMLM/4lJdk+KxwJdYOSRhOiYc7vavmA=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=ryCRBDHr40T1l3r7vuC5VhczFkyZDgUzHUV7Dc5YQrJN7G5/+O/f40PE0qR0aXenlywnToMGja3e4WBTQzAU7lfjz9etWUUsRu2bAwQhVlYabeDRzkhCeStMWfxP5/UFtZatZD8Ws8C8/jxENvWtP1V+z3kFqnG1KyxqXgK8Pl8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 90FF8C4CED2; Wed, 4 Sep 2024 23:43:26 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98) (envelope-from ) id 1slzfo-00000005Bov-0wLm; Wed, 04 Sep 2024 19:44:28 -0400 Message-ID: <20240904234428.085548777@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 04 Sep 2024 19:44:15 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , stable@vger.kernel.org, "Luis Claudio R. 
Goncalves" , Tomas Glozar Subject: [for-linus][PATCH 4/6] tracing/osnoise: Use a cpumask to know what threads are kthreads References: <20240904234411.443593140@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Steven Rostedt The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=3D$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffff= c0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002= -gbc754cc76d1b-dirty #125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.1= 6.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85= 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c = 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:00000000000000= 00 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 
000000007729e001 CR4: 0000000000170ef0 Call Trace: ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. 
RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_h= da_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.co= m/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/20240823013902.135036960@goodmis.org/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/20240823102816.5e55753b@gandal= f.local.home/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu Cc: Mathieu Desnoyers Cc: "Luis Claudio R. 
Goncalves"
Link: https://lore.kernel.org/20240904103428.08efdf4c@gandalf.local.home
Fixes: e88ed227f639e ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar
Signed-off-by: Steven Rostedt (Google)
---
 kernel/trace/trace_osnoise.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 400a72cd6ab5..8543d941b870 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1614,6 +1614,7 @@ static int run_osnoise(void)
 
 static struct cpumask osnoise_cpumask;
 static struct cpumask save_cpumask;
+static struct cpumask kthread_cpumask;
 
 /*
  * osnoise_sleep - sleep until the next period
@@ -1677,6 +1678,7 @@ static inline int osnoise_migration_pending(void)
 	 */
 	mutex_lock(&interface_lock);
 	this_cpu_osn_var()->kthread = NULL;
+	cpumask_clear_cpu(smp_processor_id(), &kthread_cpumask);
 	mutex_unlock(&interface_lock);
 
 	return 1;
@@ -1949,9 +1951,10 @@ static void stop_kthread(unsigned int cpu)
 
 	kthread = per_cpu(per_cpu_osnoise_var, cpu).kthread;
 	if (kthread) {
-		if (test_bit(OSN_WORKLOAD, &osnoise_options)) {
+		if (cpumask_test_and_clear_cpu(cpu, &kthread_cpumask) &&
+		    !WARN_ON(!test_bit(OSN_WORKLOAD, &osnoise_options))) {
 			kthread_stop(kthread);
-		} else {
+		} else if (!WARN_ON(test_bit(OSN_WORKLOAD, &osnoise_options))) {
 			/*
 			 * This is a user thread waiting on the timerlat_fd. We need
 			 * to close all users, and the best way to guarantee this is
@@ -2023,6 +2026,7 @@ static int start_kthread(unsigned int cpu)
 	}
 
 	per_cpu(per_cpu_osnoise_var, cpu).kthread = kthread;
+	cpumask_set_cpu(cpu, &kthread_cpumask);
 
 	return 0;
 }
@@ -2050,8 +2054,16 @@ static int start_per_cpu_kthreads(void)
 	 */
 	cpumask_and(current_mask, cpu_online_mask, &osnoise_cpumask);
 
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+		if (cpumask_test_and_clear_cpu(cpu, &kthread_cpumask)) {
+			struct task_struct *kthread;
+
+			kthread = per_cpu(per_cpu_osnoise_var, cpu).kthread;
+			if (!WARN_ON(!kthread))
+				kthread_stop(kthread);
+		}
 		per_cpu(per_cpu_osnoise_var, cpu).kthread = NULL;
+	}
 
 	for_each_cpu(cpu, current_mask) {
 		retval = start_kthread(cpu);
-- 
2.43.0

From nobody Fri Dec 19 16:44:56 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 18FE0186608; Wed, 4 Sep 2024 23:43:26 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493407; cv=none; b=jYP7zVjzFIIf0Z0o7oZmjrbpf4SILSpcVOmrzbM+FYbFbkHii0xZ4VbP5QqRkUy4VaNDINhr9kqPbQhjETSIx4zuFdHeUOJ5xTuufIwOyxT83YUrxQWKSrWaPXHKm90J4o2bdRZL8benY54YkjBQAJvDFMG2OiZi1xOJYasxbII=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493407; c=relaxed/simple; bh=znbtzMLOJQ9Au5yW8iyqOnvVzKUgjfxf+yVoLIyt3rs=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=llKo+letmCaGILXiaT7cCOQN6vt6tg2/fex/PAYWJ9YNLLlK6yorb67C8Xl9X7BspZO5lghV62EHMKXq09AwGdF4T2seojl65vPGrLBcjptkcrNv3Ak8wMr/gRVUxPcK9Y3/2grpDJSSgXyPT+Kt1G73fuVvmVcVFZ/0y9Glw4w=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
Received:
by smtp.kernel.org (Postfix) with ESMTPSA id A122BC4CED5; Wed, 4 Sep 2024 23:43:26 +0000 (UTC)
Received: from rostedt by gandalf with local (Exim 4.98) (envelope-from ) id 1slzfo-00000005BpP-1c3u; Wed, 04 Sep 2024 19:44:28 -0400
Message-ID: <20240904234428.246209582@goodmis.org>
User-Agent: quilt/0.68
Date: Wed, 04 Sep 2024 19:44:16 -0400
From: Steven Rostedt
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton, stable@vger.kernel.org, Zheng Yejian
Subject: [for-linus][PATCH 5/6] tracing: Avoid possible softlockup in tracing_iter_reset()
References: <20240904234411.443593140@goodmis.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Zheng Yejian

In __tracing_open(), when a max latency tracer has run on a CPU, the
start time of that CPU's buffer is updated, and event entries with
timestamps earlier than the start of the buffer are then skipped (see
tracing_iter_reset()).

A softlockup can occur if the kernel is non-preemptible and too many
entries are skipped in the loop that resets every CPU buffer, so add
cond_resched() to avoid it.
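
For illustration only (not part of this patch), the shape of the skip
loop can be sketched in user-space C. All demo_* names are hypothetical,
and sched_yield() is only a rough user-space analogue of cond_resched(),
assumed here to show where the yield point sits in the loop:

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

unsigned long demo_resched_calls;	/* counts yield points, for inspection */

/* User-space stand-in for cond_resched(): offer the CPU to other tasks. */
void demo_cond_resched(void)
{
	demo_resched_calls++;
	sched_yield();
}

/*
 * Count the leading entries whose timestamp is older than start_ts,
 * yielding once per skipped entry because the run may be very long.
 */
size_t demo_skip_old(const unsigned long long *ts, size_t n,
		     unsigned long long start_ts)
{
	size_t entries = 0;

	assert(ts != NULL || n == 0);

	while (entries < n) {
		if (ts[entries] >= start_ts)
			break;
		entries++;
		/* This could be a big loop */
		demo_cond_resched();
	}
	return entries;
}
```

The yield sits inside the loop body, so the number of yield points grows
with the number of skipped entries rather than being paid once per call.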
Cc: stable@vger.kernel.org
Fixes: 2f26ebd549b9a ("tracing: use timestamp to determine start of latency traces")
Link: https://lore.kernel.org/20240827124654.3817443-1-zhengyejian@huaweicloud.com
Suggested-by: Steven Rostedt
Signed-off-by: Zheng Yejian
Signed-off-by: Steven Rostedt (Google)
---
 kernel/trace/trace.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index ebe7ce2f5f4a..edf6bc817aa1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3958,6 +3958,8 @@ void tracing_iter_reset(struct trace_iterator *iter, int cpu)
 			break;
 		entries++;
 		ring_buffer_iter_advance(buf_iter);
+		/* This could be a big loop */
+		cond_resched();
 	}
 
 	per_cpu_ptr(iter->array_buffer->data, cpu)->skipped_entries = entries;
-- 
2.43.0

From nobody Fri Dec 19 16:44:56 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56385188582; Wed, 4 Sep 2024 23:43:27 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493407; cv=none; b=fieDUcWcSV44gyTNkV/f0FOu5w1qtS5ttTBW+6RfhL4pkoSCe8+a/7nxdXIKlegYg3iv0vwc1bVSuBURjin/+TYW8yiDGdBiNDlg4N7Q1FAB7evqg+Lmk6t0NXX+QiJsfPhSw/W5Vyf7+/RH00/rw2tnrEWC5SyirnMGTkUA/D0=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725493407; c=relaxed/simple; bh=HIysEs4RR0b2/m+X1+me+BZTI+S9WgRiIAAWQOLfrg8=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=oCWKZGWQtCwL0fM0ZxommTnAYSqhjwl4bB3ixbhnyv1DW1os4zBKsh93v4XimbuvC8r1e5IOmZ1mahxZjBfK9hYFd2ihKK6ZlgkmJjXRXooqbvVWqGtFcmMjW59Tf4xVfxDIWZRmo57w1cOEDJwHczrWpJxf4xeB+Zdw66hznZQ=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
Received: by smtp.kernel.org (Postfix) with ESMTPSA id E226BC4CED3; Wed, 4 Sep 2024 23:43:26 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98) (envelope-from ) id 1slzfo-00000005Bpt-2Ia8; Wed, 04 Sep 2024 19:44:28 -0400 Message-ID: <20240904234428.406523717@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 04 Sep 2024 19:44:17 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , stable@vger.kernel.org, Chi Zhiling Subject: [for-linus][PATCH 6/6] eventfs: Use list_del_rcu() for SRCU protected list variable References: <20240904234411.443593140@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Steven Rostedt Chi Zhiling reported: We found a null pointer accessing in tracefs[1], the reason is that the variable 'ei_child' is set to LIST_POISON1, that means the list was removed in eventfs_remove_rec. so when access the ei_child->is_freed, the panic triggered. 
by the way, the following script can reproduce this panic loop1 (){ while true do echo "p:kp submit_bio" > /sys/kernel/debug/tracing/kprobe_events echo "" > /sys/kernel/debug/tracing/kprobe_events done } loop2 (){ while true do tree /sys/kernel/debug/tracing/events/kprobes/ done } loop1 & loop2 [1]: [ 1147.959632][T17331] Unable to handle kernel paging request at virtual = address dead000000000150 [ 1147.968239][T17331] Mem abort info: [ 1147.971739][T17331] ESR =3D 0x0000000096000004 [ 1147.976172][T17331] EC =3D 0x25: DABT (current EL), IL =3D 32 bits [ 1147.982171][T17331] SET =3D 0, FnV =3D 0 [ 1147.985906][T17331] EA =3D 0, S1PTW =3D 0 [ 1147.989734][T17331] FSC =3D 0x04: level 0 translation fault [ 1147.995292][T17331] Data abort info: [ 1147.998858][T17331] ISV =3D 0, ISS =3D 0x00000004, ISS2 =3D 0x000000= 00 [ 1148.005023][T17331] CM =3D 0, WnR =3D 0, TnD =3D 0, TagAccess =3D 0 [ 1148.010759][T17331] GCS =3D 0, Overlay =3D 0, DirtyBit =3D 0, Xs =3D= 0 [ 1148.016752][T17331] [dead000000000150] address between user and kernel= address ranges [ 1148.024571][T17331] Internal error: Oops: 0000000096000004 [#1] SMP [ 1148.030825][T17331] Modules linked in: team_mode_loadbalance team nlmo= n act_gact cls_flower sch_ingress bonding tls macvlan dummy ib_core bridge = stp llc veth amdgpu amdxcp mfd_core gpu_sched drm_exec drm_buddy radeon crc= t10dif_ce video drm_suballoc_helper ghash_ce drm_ttm_helper sha2_ce ttm sha= 256_arm64 i2c_algo_bit sha1_ce sbsa_gwdt cp210x drm_display_helper cec sr_m= od cdrom drm_kms_helper binfmt_misc sg loop fuse drm dm_mod nfnetlink ip_ta= bles autofs4 [last unloaded: tls] [ 1148.072808][T17331] CPU: 3 PID: 17331 Comm: ls Tainted: G W = ------- ---- 6.6.43 #2 [ 1148.081751][T17331] Source Version: 21b3b386e948bedd29369af66f3e98ab01= b1c650 [ 1148.088783][T17331] Hardware name: Greatwall GW-001M1A-FTF/GW-001M1A-F= TF, BIOS KunLun BIOS V4.0 07/16/2020 [ 1148.098419][T17331] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -S= SBS 
BTYPE=3D--) [ 1148.106060][T17331] pc : eventfs_iterate+0x2c0/0x398 [ 1148.111017][T17331] lr : eventfs_iterate+0x2fc/0x398 [ 1148.115969][T17331] sp : ffff80008d56bbd0 [ 1148.119964][T17331] x29: ffff80008d56bbf0 x28: ffff001ff5be2600 x27: 0= 000000000000000 [ 1148.127781][T17331] x26: ffff001ff52ca4e0 x25: 0000000000009977 x24: d= ead000000000100 [ 1148.135598][T17331] x23: 0000000000000000 x22: 000000000000000b x21: f= fff800082645f10 [ 1148.143415][T17331] x20: ffff001fddf87c70 x19: ffff80008d56bc90 x18: 0= 000000000000000 [ 1148.151231][T17331] x17: 0000000000000000 x16: 0000000000000000 x15: f= fff001ff52ca4e0 [ 1148.159048][T17331] x14: 0000000000000000 x13: 0000000000000000 x12: 0= 000000000000000 [ 1148.166864][T17331] x11: 0000000000000000 x10: 0000000000000000 x9 : f= fff8000804391d0 [ 1148.174680][T17331] x8 : 0000000180000000 x7 : 0000000000000018 x6 : 0= 000aaab04b92862 [ 1148.182498][T17331] x5 : 0000aaab04b92862 x4 : 0000000080000000 x3 : 0= 000000000000068 [ 1148.190314][T17331] x2 : 000000000000000f x1 : 0000000000007ea8 x0 : 0= 000000000000001 [ 1148.198131][T17331] Call trace: [ 1148.201259][T17331] eventfs_iterate+0x2c0/0x398 [ 1148.205864][T17331] iterate_dir+0x98/0x188 [ 1148.210036][T17331] __arm64_sys_getdents64+0x78/0x160 [ 1148.215161][T17331] invoke_syscall+0x78/0x108 [ 1148.219593][T17331] el0_svc_common.constprop.0+0x48/0xf0 [ 1148.224977][T17331] do_el0_svc+0x24/0x38 [ 1148.228974][T17331] el0_svc+0x40/0x168 [ 1148.232798][T17331] el0t_64_sync_handler+0x120/0x130 [ 1148.237836][T17331] el0t_64_sync+0x1a4/0x1a8 [ 1148.242182][T17331] Code: 54ffff6c f9400676 910006d6 f9000676 (b940530= 0) [ 1148.248955][T17331] ---[ end trace 0000000000000000 ]--- The issue is that list_del() is used on an SRCU protected list variable before the synchronization occurs. This can poison the list pointers while there is a reader iterating the list. This is simply fixed by using list_del_rcu() that is specifically made for this purpose. 
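
For illustration only (not part of this patch), the difference between
the two deletion helpers can be sketched in user-space C. All demo_*
names and the poison values are hypothetical stand-ins for the kernel's
list helpers and LIST_POISON1/LIST_POISON2, and the sketch is
single-threaded, so it shows only which pointers are poisoned, not the
SRCU grace period itself:

```c
#include <assert.h>
#include <stddef.h>

struct demo_list {
	struct demo_list *next, *prev;
};

/* Illustrative poison values; assumed here, not taken from kernel headers. */
#define DEMO_POISON1 ((struct demo_list *)0x100)
#define DEMO_POISON2 ((struct demo_list *)0x122)

void demo_list_init(struct demo_list *head)
{
	head->next = head;
	head->prev = head;
}

void demo_list_add_tail(struct demo_list *node, struct demo_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void demo_unlink(struct demo_list *entry)
{
	/* Catch an entry that was already removed. */
	assert(entry->next != DEMO_POISON1 && entry->prev != DEMO_POISON2);
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Plain delete: both pointers are poisoned right away. */
void demo_list_del(struct demo_list *entry)
{
	demo_unlink(entry);
	entry->next = DEMO_POISON1;
	entry->prev = DEMO_POISON2;
}

/*
 * RCU-style delete: ->next is left intact so that a reader currently
 * standing on this entry can still walk forward to the rest of the list.
 */
void demo_list_del_rcu(struct demo_list *entry)
{
	demo_unlink(entry);
	entry->prev = DEMO_POISON2;
}
```

A reader that holds a pointer to an entry removed with demo_list_del_rcu()
can still follow ->next to a valid entry, while one removed with
demo_list_del() hands the reader a poison pointer, which matches the
poisoned address seen in the oops above.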
Link: https://lore.kernel.org/linux-trace-kernel/20240829085025.3600021-1-chizhiling@163.com/

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu
Cc: Mathieu Desnoyers
Link: https://lore.kernel.org/20240904131605.640d42b1@gandalf.local.home
Fixes: 43aa6f97c2d03 ("eventfs: Get rid of dentry pointers without refcounts")
Reported-by: Chi Zhiling
Tested-by: Chi Zhiling
Signed-off-by: Steven Rostedt (Google)
---
 fs/tracefs/event_inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 01e99e98457d..8705c77a9e75 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -862,7 +862,7 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, int level)
 	list_for_each_entry(ei_child, &ei->children, list)
 		eventfs_remove_rec(ei_child, level + 1);
 
-	list_del(&ei->list);
+	list_del_rcu(&ei->list);
 	free_ei(ei);
 }
 
-- 
2.43.0