Message-ID: <20241224193900.690648158@goodmis.org>
User-Agent: quilt/0.68
Date: Tue, 24 Dec 2024 14:38:37 -0500
From: Steven Rostedt
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton
Subject: [for-next][PATCH 1/5] fgraph: Remove unnecessary disabling of interrupts and recursion
References: <20241224193836.812390655@goodmis.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Steven Rostedt

The function graph tracer disables interrupts as well as prevents
recursion via NMIs when recording the graph tracer code. There's no reason
to do this today. That disabling goes back to 2008 when the function graph
tracer was first introduced and recursion protection wasn't part of the
code.

Today, there's no reason to disable interrupts or prevent the code from
recursing as the infrastructure can easily handle it.

Before this change:

  ~# echo function_graph > /sys/kernel/tracing/current_tracer
  ~# perf stat -r 10 ./hackbench 10
  Time: 4.240
  Time: 4.236
  Time: 4.106
  Time: 4.014
  Time: 4.314
  Time: 3.830
  Time: 4.063
  Time: 4.323
  Time: 3.763
  Time: 3.727

  Performance counter stats for '/work/c/hackbench 10' (10 runs):

         33,937.20 msec task-clock        #    7.008 CPUs utilized      ( +-  1.85% )
            18,220      context-switches  #  536.874 /sec               ( +-  6.41% )
               624      cpu-migrations    #   18.387 /sec               ( +-  9.07% )
            11,319      page-faults       #  333.528 /sec               ( +-  1.97% )
    76,657,643,617      cycles            #    2.259 GHz                ( +-  0.40% )
   141,403,302,768      instructions      #    1.84  insn per cycle     ( +-  0.37% )
    25,518,463,888      branches          #  751.932 M/sec              ( +-  0.35% )
       156,151,050      branch-misses     #    0.61% of all branches    ( +-  0.63% )

            4.8423 +- 0.0892 seconds time elapsed  ( +-  1.84% )

After this change:

  ~# echo function_graph > /sys/kernel/tracing/current_tracer
  ~# perf stat -r 10 ./hackbench 10
  Time: 3.340
  Time: 3.192
  Time: 3.129
  Time: 2.579
  Time: 2.589
  Time: 2.798
  Time: 2.791
  Time: 2.955
  Time: 3.044
  Time: 3.065

  Performance counter stats for './hackbench 10' (10 runs):

         24,416.30 msec task-clock        #    6.996 CPUs utilized      ( +-  2.74% )
            16,764      context-switches  #  686.590 /sec               ( +-  5.85% )
               469      cpu-migrations    #   19.208 /sec               ( +-  6.14% )
            11,519      page-faults       #  471.775 /sec               ( +-  1.92% )
    53,895,628,450      cycles            #    2.207 GHz                ( +-  0.52% )
   105,552,664,638      instructions      #    1.96  insn per cycle     ( +-  0.47% )
    17,808,672,667      branches          #  729.376 M/sec              ( +-  0.48% )
       133,075,435      branch-misses     #    0.75% of all branches    ( +-  0.59% )

             3.490 +- 0.112 seconds time elapsed  ( +-  3.22% )

Also removed unneeded "unlikely()" around the retaddr code.

Cc: Masami Hiramatsu
Cc: Mark Rutland
Cc: Mathieu Desnoyers
Cc: Andrew Morton
Link: https://lore.kernel.org/20241223184941.204074053@goodmis.org
Fixes: 9cd2992f2d6c8 ("fgraph: Have set_graph_notrace only affect function_graph tracer") # Performance only
Signed-off-by: Steven Rostedt (Google)
---
 kernel/trace/trace_functions_graph.c | 37 +++++++++++-----------------
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 5504b5e4e7b4..f513603d7df9 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -181,10 +181,9 @@ int trace_graph_entry(struct ftrace_graph_ent *trace,
 	struct trace_array *tr = gops->private;
 	struct trace_array_cpu *data;
 	struct fgraph_times *ftimes;
-	unsigned long flags;
 	unsigned int trace_ctx;
 	long disabled;
-	int ret;
+	int ret = 0;
 	int cpu;
 
 	if (*task_var & TRACE_GRAPH_NOTRACE)
@@ -235,25 +234,21 @@ int trace_graph_entry(struct ftrace_graph_ent *trace,
 	if (tracing_thresh)
 		return 1;
 
-	local_irq_save(flags);
+	preempt_disable_notrace();
 	cpu = raw_smp_processor_id();
 	data = per_cpu_ptr(tr->array_buffer.data, cpu);
-	disabled = atomic_inc_return(&data->disabled);
-	if (likely(disabled == 1)) {
-		trace_ctx = tracing_gen_ctx_flags(flags);
-		if (unlikely(IS_ENABLED(CONFIG_FUNCTION_GRAPH_RETADDR) &&
-			tracer_flags_is_set(TRACE_GRAPH_PRINT_RETADDR))) {
+	disabled = atomic_read(&data->disabled);
+	if (likely(!disabled)) {
+		trace_ctx = tracing_gen_ctx();
+		if (IS_ENABLED(CONFIG_FUNCTION_GRAPH_RETADDR) &&
+		    tracer_flags_is_set(TRACE_GRAPH_PRINT_RETADDR)) {
 			unsigned long retaddr = ftrace_graph_top_ret_addr(current);
-
 			ret = __trace_graph_retaddr_entry(tr, trace, trace_ctx, retaddr);
-		} else
+		} else {
 			ret = __trace_graph_entry(tr, trace, trace_ctx);
-	} else {
-		ret = 0;
+		}
 	}
-
-	atomic_dec(&data->disabled);
-	local_irq_restore(flags);
+	preempt_enable_notrace();
 
 	return ret;
 }
@@ -320,7 +315,6 @@ void trace_graph_return(struct ftrace_graph_ret *trace,
 	struct trace_array *tr = gops->private;
 	struct trace_array_cpu *data;
 	struct fgraph_times *ftimes;
-	unsigned long flags;
 	unsigned int trace_ctx;
 	long disabled;
 	int size;
@@ -341,16 +335,15 @@ void trace_graph_return(struct ftrace_graph_ret *trace,
 
 	trace->calltime = ftimes->calltime;
 
-	local_irq_save(flags);
+	preempt_disable_notrace();
 	cpu = raw_smp_processor_id();
 	data = per_cpu_ptr(tr->array_buffer.data, cpu);
-	disabled = atomic_inc_return(&data->disabled);
-	if (likely(disabled == 1)) {
-		trace_ctx = tracing_gen_ctx_flags(flags);
+	disabled = atomic_read(&data->disabled);
+	if (likely(!disabled)) {
+		trace_ctx = tracing_gen_ctx();
 		__trace_graph_return(tr, trace, trace_ctx);
 	}
-	atomic_dec(&data->disabled);
-	local_irq_restore(flags);
+	preempt_enable_notrace();
 }
 
 static void trace_graph_thresh_return(struct ftrace_graph_ret *trace,
-- 
2.45.2
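
Reader's note on the guard change in the hunks above: the old path incremented data->disabled and recorded only when the result was 1, so the counter doubled as a recursion guard. The new path only reads the counter. The following is a rough userspace sketch of that difference using C11 atomics. It is not kernel code. struct cpu_data, record_old(), record_new() and nested_old() are hypothetical names made up for this illustration, and interrupt and preemption state are not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU state that holds data->disabled. */
struct cpu_data {
	atomic_int disabled;
	int events;		/* number of events "recorded" so far */
};

/*
 * Old scheme (modelled): increment the counter and record only if this
 * caller is the sole holder. A nested caller sees a value above 1 and
 * records nothing, so the counter also acts as a recursion guard.
 */
static bool record_old(struct cpu_data *d)
{
	bool recorded = false;
	int prev;

	if (atomic_fetch_add(&d->disabled, 1) + 1 == 1) {
		d->events++;
		recorded = true;
	}
	prev = atomic_fetch_sub(&d->disabled, 1);
	assert(prev >= 1);	/* increment and decrement must stay balanced */
	return recorded;
}

/*
 * New scheme (modelled): only read the counter. It no longer blocks
 * nested callers; it only honours an explicit "disabled" request.
 */
static bool record_new(struct cpu_data *d)
{
	if (atomic_load(&d->disabled))
		return false;
	d->events++;
	return true;
}

/* Simulate a nested call arriving while an outer old-style call holds the guard. */
static bool nested_old(struct cpu_data *d)
{
	bool inner;

	atomic_fetch_add(&d->disabled, 1);	/* outer caller takes the guard */
	inner = record_old(d);			/* nested caller is rejected */
	atomic_fetch_sub(&d->disabled, 1);	/* outer caller releases it */
	return inner;
}
```

In this model, nested_old() shows the old guard dropping a nested event, while record_new() refuses to record only when the counter has been raised explicitly. In the kernel, per the changelog above, recursion is left to the tracing infrastructure.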