Documentation/trace/ftrace.rst | 6 ++++++ kernel/trace/trace.c | 1 + kernel/trace/trace_irqsoff.c | 14 ++++++++++++++ kernel/trace/trace_sched_wakeup.c | 14 ++++++++++++++ 4 files changed, 35 insertions(+)
Linus,
Tracing fixes for 6.13:
- Fix a regression in the irqsoff and wakeup latency tracing
The function graph tracer infrastructure has become generic so that
fprobes and BPF can be based on it. As it use to only handle function
graph tracing, it would always calculate the time the function entered
so that it could then calculate the time it exits and give the length of
time the function executed for. But this is not needed for the other
users (fprobes and BPF) and reading the clock adds a non-negligible
overhead, so the calculation was moved into the function graph tracer
logic.
But the irqsoff and wakeup latency tracers, when the "display-graph"
option was set, would use the function graph tracer to calculate the
times of functions during the latency. The movement of the calltime
calculation made the value zero for these tracers, and the output
no longer showed the length of time of each tracer, but instead the
absolute timestamp of when the function returned (rettime - calltime
where calltime is now zero).
Have the irqsoff and wakeup latency tracers also do the calltime
calculation as the function graph tracer does and report the proper
length of the function timings.
- Update the tracing display to reflect the new preempt lazy model
When the system is configured with preempt lazy, the output of the
trace data would state "unknown" for the current preemption model.
Because the lazy preemption model was just added, make it known
to the tracing subsystem too. This is just a one line change.
- Document multiple function graph having slightly different timings
Now that function graph tracer infrastructure is separate, this also
allows the function graph tracer to run in multiple instances (it wasn't
able to do so before). If two instances ran the function graph tracer and
traced the same functions, the timings for them will be slightly
different because each does their own timings and collects the timestamps
differently. Document this to not have people be confused by it.
Please pull the latest trace-v6.13-rc7 tree, which can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
trace-v6.13-rc7
Tag SHA1: 343380d69115b113094248a3f6bd5f3e8b14ee59
Head SHA1: 94d529a3255ce65496e932173d7f13e801170597
Shrikanth Hegde (1):
tracing: Print lazy preemption model
Steven Rostedt (2):
tracing: Fix irqsoff and wakeup latency tracers when using function graph
ftrace: Document that multiple function_graph tracing may have different times
----
Documentation/trace/ftrace.rst | 6 ++++++
kernel/trace/trace.c | 1 +
kernel/trace/trace_irqsoff.c | 14 ++++++++++++++
kernel/trace/trace_sched_wakeup.c | 14 ++++++++++++++
4 files changed, 35 insertions(+)
---------------------------
diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst
index 272464bb7c60..2b74f96d09d5 100644
--- a/Documentation/trace/ftrace.rst
+++ b/Documentation/trace/ftrace.rst
@@ -810,6 +810,12 @@ Here is the list of current tracers that may be configured.
to draw a graph of function calls similar to C code
source.
+ Note that the function graph calculates the timings of when the
+ function starts and returns internally and for each instance. If
+ there are two instances that run function graph tracer and traces
+ the same functions, the length of the timings may be slightly off as
+ each read the timestamp separately and not at the same time.
+
"blk"
The block tracer. The tracer used by the blktrace user
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index f8aebcb01e62..b6e40e8791fa 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4122,6 +4122,7 @@ print_trace_header(struct seq_file *m, struct trace_iterator *iter)
preempt_model_none() ? "server" :
preempt_model_voluntary() ? "desktop" :
preempt_model_full() ? "preempt" :
+ preempt_model_lazy() ? "lazy" :
preempt_model_rt() ? "preempt_rt" :
"unknown",
/* These are reserved for later use */
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index fce064e20570..a4e799c1e767 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -182,6 +182,7 @@ static int irqsoff_graph_entry(struct ftrace_graph_ent *trace,
struct trace_array_cpu *data;
unsigned long flags;
unsigned int trace_ctx;
+ u64 *calltime;
int ret;
if (ftrace_graph_ignore_func(gops, trace))
@@ -199,6 +200,12 @@ static int irqsoff_graph_entry(struct ftrace_graph_ent *trace,
if (!func_prolog_dec(tr, &data, &flags))
return 0;
+ calltime = fgraph_reserve_data(gops->idx, sizeof(*calltime));
+ if (!calltime)
+ return 0;
+
+ *calltime = trace_clock_local();
+
trace_ctx = tracing_gen_ctx_flags(flags);
ret = __trace_graph_entry(tr, trace, trace_ctx);
atomic_dec(&data->disabled);
@@ -213,12 +220,19 @@ static void irqsoff_graph_return(struct ftrace_graph_ret *trace,
struct trace_array_cpu *data;
unsigned long flags;
unsigned int trace_ctx;
+ u64 *calltime;
+ int size;
ftrace_graph_addr_finish(gops, trace);
if (!func_prolog_dec(tr, &data, &flags))
return;
+ calltime = fgraph_retrieve_data(gops->idx, &size);
+ if (!calltime)
+ return;
+ trace->calltime = *calltime;
+
trace_ctx = tracing_gen_ctx_flags(flags);
__trace_graph_return(tr, trace, trace_ctx);
atomic_dec(&data->disabled);
diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index d6c7f18daa15..c58292e424d5 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -118,6 +118,7 @@ static int wakeup_graph_entry(struct ftrace_graph_ent *trace,
struct trace_array *tr = wakeup_trace;
struct trace_array_cpu *data;
unsigned int trace_ctx;
+ u64 *calltime;
int ret = 0;
if (ftrace_graph_ignore_func(gops, trace))
@@ -135,6 +136,12 @@ static int wakeup_graph_entry(struct ftrace_graph_ent *trace,
if (!func_prolog_preempt_disable(tr, &data, &trace_ctx))
return 0;
+ calltime = fgraph_reserve_data(gops->idx, sizeof(*calltime));
+ if (!calltime)
+ return 0;
+
+ *calltime = trace_clock_local();
+
ret = __trace_graph_entry(tr, trace, trace_ctx);
atomic_dec(&data->disabled);
preempt_enable_notrace();
@@ -148,12 +155,19 @@ static void wakeup_graph_return(struct ftrace_graph_ret *trace,
struct trace_array *tr = wakeup_trace;
struct trace_array_cpu *data;
unsigned int trace_ctx;
+ u64 *calltime;
+ int size;
ftrace_graph_addr_finish(gops, trace);
if (!func_prolog_preempt_disable(tr, &data, &trace_ctx))
return;
+ calltime = fgraph_retrieve_data(gops->idx, &size);
+ if (!calltime)
+ return;
+ trace->calltime = *calltime;
+
__trace_graph_return(tr, trace, trace_ctx);
atomic_dec(&data->disabled);
On Thu, 16 Jan 2025 at 06:53, Steven Rostedt <rostedt@goodmis.org> wrote:
>
>
> git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
> trace-v6.13-rc7
-ENOSUCHTAG
Forgot to push out?
Linus
On Thu, 16 Jan 2025 15:02:37 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, 16 Jan 2025 at 06:53, Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git > > trace-v6.13-rc7 > > -ENOSUCHTAG > > Forgot to push out? > Ug, yeah. I pushed the branch but not the tag. Fixed. -- Steve
On Thu, 16 Jan 2025 09:53:33 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > But the irqsoff and wakeup latency tracers, when the "display-graph" > option was set, would use the function graph tracer to calculate the > times of functions during the latency. The movement of the calltime > calculation made the value zero for these tracers, and the output > no longer showed the length of time of each tracer, but instead the > absolute timestamp of when the function returned (rettime - calltime > where calltime is now zero). FYI, I have a patch that removes the calltime and rettime fields from the generic structure as they should not be used in the generic code. That is how I found this bug. I noticed that the irqsoff and wakeup tracers used that field as well, and when I tested those tracers with the display-graph option enabled, that's when I noticed the regression. But that change is a cleanup and not appropriate for an rc release. This was more of a work around to fix the regression. The proper fix will come in 6.14. -- Steve
© 2016 - 2025 Red Hat, Inc.