Document both --on-threshold and --on-end, with examples.
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
---
.../tools/rtla/common_timerlat_options.rst | 64 +++++++++++++++++++
1 file changed, 64 insertions(+)
diff --git a/Documentation/tools/rtla/common_timerlat_options.rst b/Documentation/tools/rtla/common_timerlat_options.rst
index 10dc802f8d65..7854368f1827 100644
--- a/Documentation/tools/rtla/common_timerlat_options.rst
+++ b/Documentation/tools/rtla/common_timerlat_options.rst
@@ -55,3 +55,67 @@
Set timerlat to run without workload, waiting for the user to dispatch a per-cpu
task that waits for a new period on the tracing/osnoise/per_cpu/cpu$ID/timerlat_fd.
See linux/tools/rtla/sample/timerlat_load.py for an example of user-load code.
+
+**--on-threshold** *action*
+
+ Defines an action to be executed when tracing is stopped on a latency threshold
+ specified by **-i/--irq** or **-T/--thread**.
+
+ Multiple --on-threshold actions may be specified, and they will be executed in
+ the order they are provided. If any action fails, subsequent actions in the list
+ will not be executed.
+
+ Supported actions are:
+
+ - *trace[,file=<filename>]*
+
+ Saves trace output, optionally taking a filename. Alternative to -t/--trace.
+ Note that unlike -t/--trace, specifying this multiple times will result in
+ the trace being saved multiple times.
+
+ - *signal,num=<sig>,pid=<pid>*
+
+ Sends a signal to a process. "parent" may be specified in place of a pid to
+ target the parent process of rtla.
+
+ - *shell,command=<command>*
+
+ Executes a shell command.
+
+ - *continue*
+
+ Continue tracing after actions are executed instead of stopping.
+
+ Example:
+
+ $ rtla timerlat -T 20 --on-threshold trace
+ --on-threshold shell,command="grep ipi_send timerlat_trace.txt"
+ --on-threshold signal,num=2,pid=parent
+
+ This will save a trace with the default filename "timerlat_trace.txt", print its
+ lines that contain the text "ipi_send" on standard output, and send signal 2
+ (SIGINT) to the parent process.
+
+ Performance Considerations:
+
+ For time-sensitive actions, it is recommended to run **rtla timerlat** with BPF
+ support and RT priority. Note that due to implementational limitations, actions
+ might be delayed up to one second after tracing is stopped if BPF mode is not
+ available or disabled.
+
+**--on-end** *action*
+
+ Defines an action to be executed at the end of **rtla timerlat** tracing.
+
+ Multiple --on-end actions can be specified, and they will be executed in the order
+ they are provided. If any action fails, subsequent actions in the list will not be
+ executed.
+
+ See the documentation for **--on-threshold** for the list of supported actions, with
+ the exception that *continue* has no effect.
+
+ Example:
+
+ $ rtla timerlat -d 5s --on-end trace
+
+ This runs rtla timerlat with default options and saves trace output at the end.
--
2.49.0
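
The ordering rule documented above (actions run in the order given, and a failing action stops the rest) can be illustrated with a small model. This is only a sketch under stated assumptions, not rtla's implementation: `parse_action`, `run_actions` and the stub handlers are hypothetical names, and the parser splits on every comma, so it would not cope with a shell command that itself contains commas.

```python
def parse_action(spec):
    """Split 'type,key=value,...' into (type, {key: value}). Hypothetical helper."""
    kind, _, rest = spec.partition(",")
    params = {}
    if rest:
        for item in rest.split(","):
            key, _, value = item.partition("=")
            params[key] = value
    return kind, params


def run_actions(specs, handlers):
    """Run actions in the order given and stop at the first one that fails.

    Returns the action types that were attempted, including the failing one.
    """
    attempted = []
    for spec in specs:
        kind, params = parse_action(spec)
        attempted.append(kind)
        if not handlers[kind](params):
            break  # subsequent actions in the list are not executed
    return attempted


# Stub handlers that only report success or failure; they do not touch the system.
handlers = {
    "trace": lambda params: True,
    "shell": lambda params: False,  # pretend the shell command failed
    "signal": lambda params: True,
}

specs = ["trace", "shell,command=false", "signal,num=2,pid=parent"]
attempted = run_actions(specs, handlers)
print(attempted)
```

With the stubbed failure in the shell step, the signal action is never reached, which is the behaviour the option text describes.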
On Thu, 26 Jun 2025 14:34:05 +0200
Tomas Glozar <tglozar@redhat.com> wrote:

> + For time-sensitive actions, it is recommended to run **rtla timerlat** with BPF
> + support and RT priority. Note that due to implementational limitations, actions
> + might be delayed up to one second after tracing is stopped if BPF mode is not
> + available or disabled.
> +

I'm curious to what is looked for for triggering an action. We can poll on
events and get woken when they are triggered. It may be possible to add
even more ways to wake a task waiting for something to happen.

-- Steve

On Tue, 22 Jul 2025 at 0:35, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I'm curious to what is looked for for triggering an action. We can poll on
> events and get woken when they are triggered. It may be possible to add
> even more ways to wake a task waiting for something to happen.
>

Threshold actions are triggered immediately after a sample over the set
threshold is detected by rtla.

For BPF mode, this happens almost right after the sample is processed in
the BPF program and the scheduler gets to waking up rtla following a BPF
ringbuffer write. There is only a short delay (up to tens of microseconds)
because the BPF helper defers the wake-up into irq_work.

For non-BPF mode, rtla periodically pulls samples from tracefs. When it
does that, it also checks whether tracing has been turned off. If yes,
that means there was a threshold overflow, and actions are triggered.
Since the period for that is currently set to 1 second, the action might
be delayed up to one second from the threshold occurring. That delay might
be a problem if you need to collect a lot of data from a ringbuffer in the
action, e.g. global Intel PT data collection for precise troubleshooting
of difficult latencies.

Of course, this is just an implementational limitation of the timerlat
tracer. If timerlat had an event (like osnoise's "sample_threshold")
triggered on threshold overflow and if it is possible to wait on it
even without BPF, rtla could wait on that for both BPF and non-BPF
mode instead of what it is currently doing.

Tomas

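
The "up to one second" bound for non-BPF mode follows directly from periodic polling: a threshold hit is only noticed at the next poll. The arithmetic can be sketched as below. This is a simplified model, not rtla code; `action_delay` is a hypothetical name, and it assumes polls happen at exact multiples of the period, starting at time zero.

```python
import math


def action_delay(threshold_time, poll_period=1.0):
    """Return how long after a threshold hit the next poll notices it.

    Assumes polls happen at exact multiples of poll_period (a simplification).
    """
    next_poll = math.ceil(threshold_time / poll_period) * poll_period
    return next_poll - threshold_time


# Threshold hits at different points within the polling period.
for hit in (0.25, 1.5, 2.0):
    print(hit, action_delay(hit))
```

In this model the delay is always shorter than one poll period, and it approaches a full period when the threshold is crossed just after a poll.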
On Tue, 22 Jul 2025 09:03:24 +0200
Tomas Glozar <tglozar@redhat.com> wrote:

> Of course, this is just an implementational limitation of the timerlat
> tracer. If timerlat had an event (like osnoise's "sample_threshold")
> triggered on threshold overflow and if it is possible to wait on it
> even without BPF, rtla could wait on that for both BPF and non-BPF
> mode instead of what it is currently doing.

Right. Is this something you may want?

-- Steve

On Tue, 22 Jul 2025 at 17:30, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 22 Jul 2025 09:03:24 +0200
> Tomas Glozar <tglozar@redhat.com> wrote:
>
> > Of course, this is just an implementational limitation of the timerlat
> > tracer. If timerlat had an event (like osnoise's "sample_threshold")
> > triggered on threshold overflow and if it is possible to wait on it
> > even without BPF, rtla could wait on that for both BPF and non-BPF
> > mode instead of what it is currently doing.
>
> Right. Is this something you may want?
>

I don't think it is that important. Non-BPF mode is mostly a fallback
for users of rtla on older kernels which don't have the
osnoise:timerlat_sample trace event. Those are (I assume) mostly users of
LTS distributions who run newer rtla from a container. Adding a new event
wouldn't help in their case.

The only users who would benefit from that are those who don't have BPF
or libbpf. If there is interest in using low-latency actions on threshold
in such settings, I'm not against implementing a threshold overflow
tracepoint and supporting it in rtla for triggering actions on threshold.

Tomas