[PATCH v4] libbpf: harden parse_vma_segs() path parsing

Michael Bommarito posted 1 patch 1 day, 21 hours ago
tools/lib/bpf/usdt.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
Posted by Michael Bommarito 1 day, 21 hours ago
parse_vma_segs() in tools/lib/bpf/usdt.c parses /proc/<pid>/maps
with two unbounded conversions: "%s" into mode[16] and "%[^\n]"
into line[PATH_MAX]. A VMA name in maps is not limited to the size of
that local buffer; a deeply nested backing path can produce a maps
record long enough to overflow the stack buffer.

Bound both conversions to the declared buffer sizes ("%15s" for mode[16]
and "%4095[^\n]" for line[4096]) and drain any residue past line[4094]
with "%*[^\n]" before the trailing "\n". Without the drain, the residue
of an over-long record would stay in the stream and break the next
"%zx-%zx" parse, so the loop would exit early and silently skip later
maps records.

Also stop using sscanf(..., "%s") to peel the /proc/<pid>/root prefix
from lib_path. Parse the pid and prefix length with "%n", check for the
following slash, and copy the remainder with libbpf_strlcpy(). That
removes a second unbounded stack write and preserves paths containing
spaces.

Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic")
Cc: stable@vger.kernel.org
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
v4:
- Carry Emil's Reviewed-by.
- Simplify the /proc/<pid>/root prefix handling with sscanf() %n,
  removing the unreachable snprintf() length check.
- Initialize and check the %n output before using it, so partial
  literal matches after the pid cannot use an unassigned offset.
- Add a short comment for the %n return-value rule.
- Declare the maps-line buffer as line[4096] to match the %4095
  scanset width.
- Reword the maps-line comment without seq_file implementation detail.
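
For reviewers who want to poke at the %n rule in isolation, here is a
minimal standalone sketch of the prefix handling described in the v4
notes above. strip_proc_root() is a hypothetical helper written for
illustration; it is not a libbpf function, and it returns a pointer into
lib_path instead of copying with libbpf_strlcpy().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libbpf: if lib_path has the form
 * /proc/<pid>/root/<rest> for the given pid, return a pointer to
 * "/<rest>" inside lib_path; otherwise return NULL.
 */
static const char *strip_proc_root(const char *lib_path, int pid)
{
	int tmp_pid;
	int n = 0; /* stays 0 unless sscanf() actually reaches %n */

	assert(lib_path != NULL);

	/*
	 * %n does not count toward the return value, so a return of 1 only
	 * proves that %d matched. n > 0 proves that the literal "/root"
	 * after the pid matched as well.
	 */
	if (sscanf(lib_path, "/proc/%d/root%n", &tmp_pid, &n) != 1)
		return NULL;
	if (n <= 0 || tmp_pid != pid || lib_path[n] != '/')
		return NULL;

	return lib_path + n;
}
```

With pid 42, "/proc/42/root/usr/lib/my lib.so" yields
"/usr/lib/my lib.so" (the space survives), while "/proc/42/roo" and
"/proc/42/rootfs/x" are both rejected: the first leaves n at 0, the
second fails the check for the following slash.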

v3:
- Correct the Fixes tag to the initial USDT implementation commit,
  per BPF CI review after adding the second site.

v2:
- Replace the unbounded /proc/<pid>/root sscanf() path peeling with
  bounded prefix handling, addressing review feedback on v1 and
  preserving paths containing spaces.
- Keep the v1 maps parser fix using bounded fscanf() scansets and a
  suppressed scanset drain for over-long records.
- Re-ran real parse_vma_segs() ASAN harnesses for the original maps
  overflow, the proc-root overflow, proc-root paths with spaces, and
  adjacent successful parses after an over-long maps record.

Reproduced with Debian 12 on rootless podman: an unprivileged
container process mkdirs 50 nested 200-char directories and mmaps
a file at the bottom, producing a >10KB /proc/<host-pid>/maps
line.  A harness on the host then calls the real parse_vma_segs()
against the container's PID; libbpf is built with
-fsanitize=address and the only local source change is dropping
the "static" keyword on parse_vma_segs so the symbol is linkable
from the harness.

Stock libbpf reports:

  ==ERROR: AddressSanitizer: stack-buffer-overflow
  WRITE of size >10KB at <line> thread T0
    #0 scanf_common -> #1 __isoc99_fscanf
    #3 parse_vma_segs tools/lib/bpf/usdt.c:509
  Address ... in frame parse_vma_segs at offset 8512, just past
  line[PATH_MAX].

Patched libbpf parses the same maps cleanly.  Follow-up calls
return 0 with seg_cnt > 0 for libc.so.6 and for
ld-linux-x86-64.so.2; the latter appears in maps after the
over-long entry, so it exercises the "%*[^\n]" drain.

On normal hardened builds the stack canary aborts the consumer;
on builds without stack protector the bytes written past line[]
are attacker-influenced path bytes.

Selftest gate
=============

tools/testing/selftests/bpf/test_progs -t usdt under QEMU x86_64
(KVM) on the patched kernel: all 6 subtests pass (usdt/basic,
basic_optimized, optimized_attach, multispec, urand_auto_attach,
urand_pid_attach) on both stock and patched libbpf, diff-clean.
The in-tree selftest does not itself exercise long maps records.

 tools/lib/bpf/usdt.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
index e3710933fd52a..57fb82bb81b58 100644
--- a/tools/lib/bpf/usdt.c
+++ b/tools/lib/bpf/usdt.c
@@ -468,10 +468,10 @@ static int parse_elf_segs(Elf *elf, const char *path, struct elf_seg **segs, siz
 
 static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs, size_t *seg_cnt)
 {
-	char path[PATH_MAX], line[PATH_MAX], mode[16];
+	char path[PATH_MAX], line[4096], mode[16];
 	size_t seg_start, seg_end, seg_off;
 	struct elf_seg *seg;
-	int tmp_pid, i, err;
+	int tmp_pid, n, i, err;
 	FILE *f;
 
 	*seg_cnt = 0;
@@ -480,8 +480,13 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
 	 * /proc/<pid>/root/<path>. They will be reported as just /<path> in
 	 * /proc/<pid>/maps.
 	 */
-	if (sscanf(lib_path, "/proc/%d/root%s", &tmp_pid, path) == 2 && pid == tmp_pid)
+	/* %n is not counted in sscanf() return value, so initialize it. */
+	n = 0;
+	if (sscanf(lib_path, "/proc/%d/root%n", &tmp_pid, &n) == 1 &&
+	    n > 0 && pid == tmp_pid && lib_path[n] == '/') {
+		libbpf_strlcpy(path, lib_path + n, sizeof(path));
 		goto proceed;
+	}
 
 	if (!realpath(lib_path, path)) {
 		pr_warn("usdt: failed to get absolute path of '%s' (err %s), using path as is...\n",
@@ -504,8 +509,11 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
 	 * 7f5c6f5d1000-7f5c6f5d3000 rw-p 001c7000 08:04 21238613      /usr/lib64/libc-2.17.so
 	 * 7f5c6f5d3000-7f5c6f5d8000 rw-p 00000000 00:00 0
 	 * 7f5c6f5d8000-7f5c6f5d9000 r-xp 00000000 103:01 362990598    /data/users/andriin/linux/tools/bpf/usdt/libhello_usdt.so
+	 *
+	 * Some VMA names can be longer than the local buffer. Bound the
+	 * writes, but still consume the rest of the line.
 	 */
-	while (fscanf(f, "%zx-%zx %s %zx %*s %*d%[^\n]\n",
+	while (fscanf(f, "%zx-%zx %15s %zx %*s %*d%4095[^\n]%*[^\n]\n",
 		      &seg_start, &seg_end, mode, &seg_off, line) == 5) {
 		void *tmp;
 
-- 
2.53.0