[RFC PATCH 0/3] rtla: Synchronize sample collection methods

Tomas Glozar posted 3 patches 2 weeks ago
include/trace/events/osnoise.h        | 14 ++++++---
kernel/trace/trace_osnoise.c          | 10 +++++--
tools/tracing/rtla/src/timerlat.bpf.c |  9 ++++++
tools/tracing/rtla/src/timerlat.c     | 43 +++++++++++++++++++++++++--
tools/tracing/rtla/src/timerlat.h     |  2 ++
tools/tracing/rtla/src/timerlat_bpf.c |  8 +++++
6 files changed, 78 insertions(+), 8 deletions(-)
Posted by Tomas Glozar 2 weeks ago
This is a proposal to synchronize the start of sample collection throughout
all places where samples are collected in RTLA, using a lockless method of
tracking the count of active tracefs instances set to the timerlat tracer.

There are three different places where timerlat samples are collected in RTLA:
two tracefs instances (auto-analysis and record), and a BPF program attached
to the osnoise:timerlat_sample tracepoint.

The BPF program collects all samples and generates the main statistics from
them in either top or hist mode, while the auto-analysis and record instances
are used as ring buffers with the intention of capturing the tail of both
timerlat samples and additional tracepoints.

In instances where BPF support is not available, the BPF program is replaced
by a third tracefs instance, called the "main instance".

One problem with this approach is that each of these sample collectors is
turned on [1] separately. In the present state of RTLA, this happens in the
following order:

- record instance
- auto-analysis instance
- BPF program / main instance

[1] By "turned on", I mean toggling tracing on (for tracefs instances) or
attachment (for the BPF program). Note that timerlat starts measurement
on tracer registration, even if tracing is off by default. This might have
been originally unintentional - considering that explicitly turning off
tracing stops the measurement threads - but is now relied upon in RTLA.

As a result, some samples are seen only by the collectors that are enabled
earlier, which produces confusing results. For example, auto-analysis might
show spikes that are not seen in the histogram, and trace output (record)
might show samples that are not seen by either the histogram or
auto-analysis.

To enable RTLA to analyze samples consistently, the first patch adds two fields
to the osnoise:timerlat_sample tracepoint: instances_registered and
instances_on. During the recording of a timerlat sample, timerlat counts
how many instances are registered and how many are on, and attaches
the information to the osnoise:timerlat_sample trace event, which is moved
to occur after the samples are recorded.
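
The counting step can be pictured with a small user-space model. This is only a sketch: struct model_instance, struct instance_counts and count_instances() are hypothetical names made up for illustration, and the actual patch works on the kernel's own instance bookkeeping rather than on a plain array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one tracefs instance (not a kernel type). */
struct model_instance {
	bool registered;	/* timerlat is set as the tracer of this instance */
	bool tracing_on;	/* tracing is toggled on for this instance */
};

/* The two values the first patch attaches to osnoise:timerlat_sample. */
struct instance_counts {
	int instances_registered;
	int instances_on;
};

/* Count registered instances and, among those, the ones with tracing on. */
static struct instance_counts count_instances(const struct model_instance *inst,
					      size_t n)
{
	struct instance_counts c = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		if (!inst[i].registered)
			continue;
		c.instances_registered++;
		if (inst[i].tracing_on)
			c.instances_on++;
	}
	return c;
}
```

In this model, a sample taken while the record instance is on but the auto-analysis instance is still off would carry instances_registered = 2 and instances_on = 1.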

The second patch makes RTLA count how many tracefs instances are to be enabled
in total, passes this number to the BPF program, and makes it drop any samples
that arrive before the instances are turned on. This ensures all samples
recorded in the main statistics (e.g. histogram) are seen by all active tracefs
instances, that is, auto-analysis and trace output (record).
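
The filter described above might look roughly like the following, written as plain C rather than as BPF code so it can be compiled anywhere. The names struct sample_counts, sample_seen_by_all() and expected_instances are assumptions for illustration and are not taken from the patch. The ">=" comparison is also an assumption; the patch may compare differently.

```c
#include <stdbool.h>

/*
 * Hypothetical mirror of the two fields added to osnoise:timerlat_sample
 * by the first patch; the real tracepoint carries more data.
 */
struct sample_counts {
	int instances_registered;
	int instances_on;
};

/*
 * Return true if the sample may enter the main statistics.
 *
 * expected_instances is the number of tracefs instances RTLA intends to
 * enable (for example auto-analysis plus record). A sample recorded while
 * fewer instances were on was not seen by all of them and is dropped.
 */
static bool sample_seen_by_all(const struct sample_counts *s,
			       int expected_instances)
{
	return s->instances_on >= expected_instances;
}
```

With expected_instances set to 2, a sample carrying instances_on = 1 is dropped, and the first sample carrying instances_on = 2 is the first one counted in the histogram.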

The third patch then moves the attachment of the BPF program before the enabling
of the tracefs instances, provided that the kernel supports the instance counting
introduced in the first patch. The filtering added in the second patch ensures
that all samples recorded by the program are also seen by the last enabled
tracefs instance.

Synchronization between different tracefs instances, that is, auto-analysis and
trace output, and the main instance in non-BPF mode, is possible by looking at
the osnoise:timerlat_sample record that comes after each sample, but is not yet
implemented in the current version of the patchset.

Tomas Glozar (3):
  tracing/osnoise: Record timerlat instance counts
  rtla/timerlat_bpf: Filter samples unseen by tracer
  rtla/timerlat: Attach BPF program before tracers

-- 
2.52.0
Re: [RFC PATCH 0/3] rtla: Synchronize sample collection methods
Posted by Steven Rostedt 1 week, 6 days ago
On Fri, 23 Jan 2026 16:25:31 +0100
Tomas Glozar <tglozar@redhat.com> wrote:

> To enable RTLA to analyze samples consistently, the first patch adds two fields
> to the osnoise:timerlat_sample tracepoint: instances_registered and
> instances_on. During the recording of a timerlat sample, timerlat counts
> how many instances are registered and how many are on, and attaches
> the information to the osnoise:timerlat_sample trace event, which is moved
> to occur after the samples are recorded.

Can't RTLA simply write into trace_marker or trace_marker_raw an event
that states "tracing is now active" and ignore anything before that event?

Heck, it will include a timestamp, so you only need to write once and
ignore any event that occurred before that timestamp.

-- Steve
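
A minimal sketch of the trace_marker idea suggested above, under assumptions: write_marker() and event_after_marker() are hypothetical helpers, not existing RTLA functions. The marker's timestamp is assumed to be read back from the trace output by the caller, which is not shown here.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Write one message to a trace_marker file. The path is a parameter so the
 * caller can point it at the trace_marker file of a specific instance.
 * Returns 0 on success, -1 on failure.
 */
static int write_marker(const char *path, const char *msg)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(msg, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Keep only events whose timestamp is not older than the marker's.
 * Both timestamps are assumed to come from the same trace clock.
 */
static bool event_after_marker(unsigned long long event_ts_ns,
			       unsigned long long marker_ts_ns)
{
	return event_ts_ns >= marker_ts_ns;
}
```

A caller would write the marker once after the last collector is turned on, then skip every sample for which event_after_marker() returns false.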