An Intel PT trace can be turned into LBR events with the --itrace=L
option of either perf script or perf inject. With perf inject, the
generated perf.data file could not be parsed because the sample events
were out of sync with their perf_event_attr. A range of fixes was
required.
This series was split out of a larger perf script refactor that
exposed the breakage:
https://lore.kernel.org/lkml/20260425224951.174663-1-irogers@google.com/
v8:
- Avoid potential NULL pointer dereference of evsel in Commit 2:
* If machines__deliver_event() processes a malformed PERF_RECORD_READ
event with an unknown or missing sample ID, the evsel is passed
as NULL.
* Added an early evsel == NULL check in perf_event__repipe_sample()
that falls back to repiping the raw event unmodified, preventing a
segmentation fault.
v7:
- Fixed a critical NULL pointer dereference crash in Commit 2:
* In tools/perf/util/intel-pt.c, when branch stack injection is
requested (add_last_branch is true) but last_branch is false (such
as in perf inject --itrace=L), ptq->last_branch was not allocated.
* When PEBS branch stack synthesis is forced via
evsel->synth_sample_type, the code dereferenced ptq->last_branch
inside do_synth_pebs_sample (either in intel_pt_add_lbrs or by
setting nr = 0), causing a SIGSEGV.
* Fixed by ensuring ptq->last_branch is successfully allocated in
intel_pt_alloc_queue() when add_last_branch is requested.
v6:
- Address critical security and correctness feedback in Commit 2:
* Fixed potential out-of-bounds read in perf_event__repipe_attr() by
moving the header.size validation check before accessing
event->attr.attr.size or copying.
* Prevented 32-bit integer wrapping overflow on header.size validation
by using subtraction instead of addition.
- Restored PEBS LBR synthesis fixes in tools/perf/util/intel-pt.c.
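
The header.size validation described for v6 can be sketched roughly as
follows. This is a minimal illustration and not the code in
builtin-inject.c: SKETCH_HDR_SIZE, sketch_attr_fits() and
sketch_n_ids() are hypothetical names, and the 16-bit record size and
32-bit attr size are assumptions about the field widths. The point is
that "hdr + attr_size <= size" can wrap for a hostile attr_size, while
the subtraction form cannot once the record is known to hold the
header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fixed part of a record header. */
#define SKETCH_HDR_SIZE 8u

/*
 * Return true if a record of total size hdr_size can hold the fixed
 * header followed by attr_size bytes of attribute payload.
 */
static bool sketch_attr_fits(uint16_t hdr_size, uint32_t attr_size)
{
	/* Reject records too small to hold the header itself. */
	if (hdr_size < SKETCH_HDR_SIZE)
		return false;
	/* The subtraction cannot wrap because of the check above. */
	return attr_size <= (uint32_t)hdr_size - SKETCH_HDR_SIZE;
}

/* Number of u64 IDs trailing the attr; 0 if the sizes are inconsistent. */
static size_t sketch_n_ids(uint16_t hdr_size, uint32_t attr_size)
{
	uint32_t trailing;

	/* Validate before touching any payload-derived arithmetic. */
	if (!sketch_attr_fits(hdr_size, attr_size))
		return 0;
	trailing = (uint32_t)hdr_size - SKETCH_HDR_SIZE - attr_size;
	/* Invariant from the check above: the ID area lies inside the record. */
	assert(trailing <= hdr_size);
	return trailing / sizeof(uint64_t);
}
```

With the addition form, an attr_size of 0xffffffff would wrap to a
small value and pass the check; here it is rejected and the ID count
stays at 0.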
v5:
- Restored the missing PEBS branch stack synthesis fixes in intel-pt.c
which were accidentally dropped in a previous rebase/conflict
resolution.
- Addressed the pipe mode size mismatch and ID array out-of-bounds read:
* Safely copy the incoming attribute payload using min_t and
memset zero, preventing trailing ID corruption.
* Explicitly set the synthesized event's attr.size to the tool's
sizeof(struct perf_event_attr), so that the offset of the ID array
matches the copied attribute and no garbage IDs or underflowing
out-of-bounds reads can occur.
- Refactored both commit descriptions to strictly focus on code changes,
deferring meta-commentary and implementation details exclusively to
the cover letter.
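
The pipe mode attribute copy described for v5 amounts to a bounded
copy into a zeroed destination. A minimal sketch follows, assuming a
cut-down attribute layout: struct sketch_attr and sketch_copy_attr()
are hypothetical, and a plain conditional stands in for the kernel's
min_t().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct perf_event_attr. */
struct sketch_attr {
	uint32_t type;
	uint32_t size;
	uint64_t config;
	uint64_t sample_type;
};

/*
 * Copy an incoming attribute of src_size bytes into dst. Older
 * producers may send a shorter attr and newer ones a longer attr.
 * Returns the number of bytes taken from src.
 */
static size_t sketch_copy_attr(struct sketch_attr *dst, const void *src,
			       size_t src_size)
{
	/* Equivalent of min_t(size_t, src_size, sizeof(*dst)). */
	size_t n = src_size < sizeof(*dst) ? src_size : sizeof(*dst);

	assert(dst && src);
	/* Zero first so fields the producer did not send read as 0. */
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, n);
	/* Advertise the local layout so any trailing data lines up. */
	dst->size = sizeof(*dst);
	return n;
}
```

Zeroing before the copy means a short incoming attribute cannot leave
stale bytes behind, and a long one is truncated to the layout the tool
was built with.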
v4:
- Avoid temporary regressions in Commit 1:
* Used local masked sample_type in convert_sample_callchain instead of
unmasked evsel attribute, preventing heap overflows.
* Promoted hardware tracer signature changes and dynamic retrieval of
branch_sample_type to Commit 1, removing hardcoded 0 bugs.
* Checked sample->evsel first before performing evlist__id2evsel lookup
to optimize evsel retrieval when already populated.
- Address critical security and correctness feedback in Commit 2:
* Added check in perf_event__repipe_attr to prevent n_ids underflow.
* Fixed early return error path in perf_event__repipe_sample to
prevent state corruption and dangling pointers on dummy_bs.
* Ensured perf_inject__cut_auxtrace_sample cuts the 8-byte size field
even when aux_sample.size is 0 to prevent parser misalignment.
* Expanded older attributes to PERF_ATTR_SIZE_VER2 in file mode
within __cmd_inject to prevent silent truncation of
branch_sample_type.
* Added bounds checks against PERF_SAMPLE_MAX_SIZE to all hardware
tracing synthetic helpers to prevent heap buffer overflows.
* Fixed checkpatch.pl warnings/errors for line-wrapping.
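
The size accounting behind the bounds checks above can be sketched as
follows. This is an illustration under stated assumptions, not the
helpers in synthetic-events.c: sketch_branch_stack_size() is a
hypothetical name, SKETCH_SAMPLE_MAX_SIZE is an assumed stand-in for
PERF_SAMPLE_MAX_SIZE, and has_hw_idx stands for whatever the evsel's
branch_sample_type says about the hw_idx field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for PERF_SAMPLE_MAX_SIZE. */
#define SKETCH_SAMPLE_MAX_SIZE (1u << 16)

/* Hypothetical stand-in for one branch stack entry. */
struct sketch_branch_entry {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/*
 * Compute the size of a sample whose other fields take base bytes and
 * which carries a branch stack of nr entries. Returns 0 and stores the
 * size in *out, or -1 if the result would exceed the maximum.
 */
static int sketch_branch_stack_size(size_t base, uint64_t nr,
				    bool has_hw_idx, size_t *out)
{
	/* u64 nr, plus an optional u64 hw_idx. */
	size_t fixed = sizeof(uint64_t) + (has_hw_idx ? sizeof(uint64_t) : 0);

	assert(out);
	if (base > SKETCH_SAMPLE_MAX_SIZE ||
	    fixed > SKETCH_SAMPLE_MAX_SIZE - base)
		return -1;
	/* Divide rather than multiply so a huge nr cannot overflow. */
	if (nr > (SKETCH_SAMPLE_MAX_SIZE - base - fixed) /
		 sizeof(struct sketch_branch_entry))
		return -1;
	*out = base + fixed + (size_t)nr * sizeof(struct sketch_branch_entry);
	return 0;
}
```

Hardcoding has_hw_idx to false when the attribute requests hw_idx
would make the computed size 8 bytes too small, which is the kind of
mismatch that leaves a parser out of step with the sample.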
v3:
- Add missing Fixes: tags on both commits.
- Refactor perf_event__repipe_attr to avoid in-place modifications on
read-only mmap buffers, preventing SIGSEGV in file mode and premature
evsel updates in pipe mode.
- Use perf_event__synthesize_attr to correctly construct and repipe
attributes in pipe mode.
- Replace manual arithmetic in convert_sample_callchain with
perf_event__sample_event_size to avoid leaking uninitialized memory.
- Retrieve evsel branch_sample_type dynamically in util/arm-spe.c and
util/cs-etm.c instead of hardcoding 0, resolving missing hw_idx field
on synthesized branch stacks.
v2: Address sashiko feedback on patch 2; add Namhyung's Acked-by to patch 1.
v1: https://lore.kernel.org/lkml/20260428070328.1880314-1-irogers@google.com/
Ian Rogers (2):
perf event: Fix size of synthesized sample with branch stacks
perf inject: Fix itrace branch stack synthesis
tools/perf/bench/inject-buildid.c | 9 +-
tools/perf/builtin-inject.c | 165 +++++++++++++++++++++++++----
tools/perf/tests/dlfilter-test.c | 8 +-
tools/perf/tests/sample-parsing.c | 5 +-
tools/perf/util/arm-spe.c | 28 ++++-
tools/perf/util/cs-etm.c | 28 ++++-
tools/perf/util/intel-bts.c | 3 +-
tools/perf/util/intel-pt.c | 35 ++++--
tools/perf/util/synthetic-events.c | 25 +++--
tools/perf/util/synthetic-events.h | 6 +-
10 files changed, 260 insertions(+), 52 deletions(-)
--
2.54.0.631.ge1b05301d1-goog