[PATCHES v3 00/29] perf: Harden perf.data parsing against crafted/corrupted files

Arnaldo Carvalho de Melo posted 29 patches 2 weeks ago
perf.data validation and hardening (29 patches)

A crafted or corrupted perf.data file can cause out-of-bounds
reads/writes, infinite loops, heap overflows, and segfaults in perf
report, perf script, perf inject, perf timechart, and perf kwork.
This series adds defense-in-depth validation for file parsing:

- Per-event-type minimum size table, enforced before swap and
  processing on both native and cross-endian paths.

- Bounds-checking the one_mmap fast path in peek_event against the
  mapped region size, preventing OOB reads from crafted file_offset.

- Swap handler return values (void -> int) so handlers can propagate
  errors instead of silently corrupting adjacent memory.

- Bounds checking for string fields (null-termination), array counts
  (nr vs payload size), feature section sizes (vs file size), and
  CPU indices (vs nr_cpus_avail / array allocation).

- ABI0 handling for perf_event_attr.size == 0 across all code paths
  (swap, native, synthesize, read_event_desc), with consistent
  behavior regardless of file endianness.

- READ_ONCE() snapshot of event->header.size in process_user_event(),
  so the compiler cannot rematerialize the load and re-read the size
  from MAP_SHARED memory.

- Sanitizer-aware shell test: the truncated perf.data test captures
  stderr and checks for ASAN/MSAN/TSAN/UBSAN markers, since sanitizer
  exits use code 1 which otherwise looks like a clean error exit.
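
As a rough illustration of the first, second and fourth items above, here
is a minimal sketch of a per-type minimum size table, an overflow-safe
region check, and an nr-vs-payload check. It is not the code from the
series: struct rec_header, the REC_* types, rec_min_size[],
range_in_region() and rec_validate() are hypothetical names made up for
this example, and the record layouts are simplified.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header, loosely modelled on a type/misc/size triple. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size in bytes, header included */
};

/* Hypothetical record types, for illustration only. */
enum { REC_COMM, REC_IDS, REC_MAX };

/* Fixed part of each hypothetical record: header plus fixed fields. */
#define REC_COMM_FIXED	(sizeof(struct rec_header) + 2 * sizeof(uint32_t))
#define REC_IDS_FIXED	(sizeof(struct rec_header) + sizeof(uint64_t))

/* Smallest well-formed size for each record type. */
static const size_t rec_min_size[REC_MAX] = {
	[REC_COMM] = REC_COMM_FIXED + 1,	/* pid, tid, at least a NUL byte */
	[REC_IDS]  = REC_IDS_FIXED,		/* nr, possibly zero ids */
};

/* Overflow-safe test that [off, off + len) lies inside a region. */
static int range_in_region(uint64_t off, uint64_t len, uint64_t region_size)
{
	return off <= region_size && len <= region_size - off;
}

/* Return 0 if the record at buf is safe to process, -1 otherwise. */
static int rec_validate(const void *buf, size_t avail)
{
	struct rec_header hdr;
	uint64_t nr;

	if (avail < sizeof(hdr))
		return -1;
	/* Copy once, then validate and use the same copy. */
	memcpy(&hdr, buf, sizeof(hdr));

	if (hdr.size < sizeof(hdr) || hdr.size > avail)
		return -1;
	if (hdr.type >= REC_MAX || hdr.size < rec_min_size[hdr.type])
		return -1;

	if (hdr.type == REC_IDS) {
		/* The element count must fit in the bytes after the fixed part. */
		memcpy(&nr, (const unsigned char *)buf + sizeof(hdr), sizeof(nr));
		if (nr > (hdr.size - REC_IDS_FIXED) / sizeof(uint64_t))
			return -1;
	}
	return 0;
}
```

The count check divides the payload size rather than multiplying nr, so a
huge nr cannot wrap around and pass the comparison.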

Pre-existing bugs fixed along the way:

- event_contains() macro off-by-one (checked start, not full extent)

- zstd_decompress_stream multi-iteration output.pos bug
 
- zstd_compress_stream_to_records: broken memcpy fallback -> return -1
  + ZSTD context reset + dst_size underflow guard
 
- PERF_RECORD_SWITCH sample_id_all offset wrong for non-CPU_WIDE
 
- cpu_map__from_range any_cpu used as count instead of boolean
 
- cpu_map__from_mask double-fetch heap overflow (j >= weight guard)
 
- kwork cpus_runtime BUG_ON with signed comparison
 
- perf_header__getbuffer64 EOF without errno (silent success)
 
- read_event_desc ABI0 sentinel (attr.size=0 -> free_event_desc early stop)
 
- EVENT_UPDATE MASK: missing offsetof underflow guard + pr_warning on
  mask32/mask64 validation paths
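
To make the event_contains() item above concrete, the following sketch
contrasts a start-only check with a full-extent check. It is an
assumption about the shape of the problem, not the macro from the
series: struct rec_example, FIELD_END(), contains_start() and
contains_full() are hypothetical, and the record size is passed
explicitly so the example stays within ISO C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout, for illustration only. */
struct rec_hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct rec_example {
	struct rec_hdr header;
	uint64_t always_present;
	uint64_t optional_tail;	/* only present in larger records */
};

/* Offset one past the last byte of a member. */
#define FIELD_END(type, mem) \
	(offsetof(type, mem) + sizeof(((type *)0)->mem))

/* Start-only check: true once the member's first byte is inside the record. */
#define contains_start(type, size, mem) \
	((size) > offsetof(type, mem))

/* Full-extent check: true only if every byte of the member is inside. */
#define contains_full(type, size, mem) \
	((size) >= FIELD_END(type, mem))
```

With a record size of 20, contains_start() accepts optional_tail (which
starts at offset 16) even though reading its 8 bytes would run 4 bytes
past the end of the record; contains_full() rejects it.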

Additional pre-existing issues were noticed during review and will be
addressed in follow-up series.

Testing
-------

- perf test at baseline and at patches 1, 8, 11, 17, 21, 26, 29
  with 300s timeout -- no regressions detected.
- Build with both gcc and clang at every patch.
- checkpatch.pl on all 29 patches.
- Full root perf test on x86_64 (x1, i7-1260P) and aarch64
  (Raspberry Pi 4, Cortex-A72, Debian trixie).

Developed with AI assistance (Claude/sashiko), tagged in commits.

Changes in v2
-------------

- Patch 8: strnlen with 'end - data' limit instead of open-ended strlen
- Patch 10: ABI0 attr.size==0 handling for native-endian path
- Patch 13: READ_ONCE snapshot for mask32_data.nr, long_size validation
- Patch 17: attr_size bounds check for all PRINT_ATTRn macros
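
The patch 8 item amounts to scanning for the terminator only within the
bytes that remain in the record. A minimal sketch of that idea follows,
assuming a hypothetical helper bounded_str_len(); it uses memchr(),
which performs the same bounded scan as strnlen(data, end - data) but
is available in plain ISO C.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the length of the string starting at
 * data if a NUL byte occurs before end, or -1 if the string is
 * unterminated within [data, end).
 */
static long bounded_str_len(const char *data, const char *end)
{
	const char *nul;

	if (data >= end)
		return -1;	/* nothing left to scan */

	/* Never look past end, unlike an open-ended strlen(). */
	nul = memchr(data, '\0', (size_t)(end - data));
	return nul ? (long)(nul - data) : -1;
}
```

A caller would reject the record when the helper returns -1 instead of
passing the pointer on to code that expects a terminated string.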

Changes in v3
-------------

- Patch 10: fix perf_event__repipe_attr() in builtin-inject.c to
  handle ABI0 attr.size==0. It was using the raw size for memcpy and
  the perf_record_header_attr_id() macro, both of which break when
  attr.size is 0.
- Patch 12: add sample_id_all handling to perf_event__build_id_swap().
  perf_event__synthesize_build_id() appends id_sample data, so
  cross-endian pipe mode must swap those trailing fields.
- Patch 24: remove comp_mmap_len upper-bound cap that rejected valid
  perf record -m 2G recordings (mmap_len exceeds 2GB - 4096).  The
  downstream decompression path already checks against SIZE_MAX.

Cheers,

- Arnaldo