tools/lib/bpf/usdt.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
parse_vma_segs() in tools/lib/bpf/usdt.c parses /proc/<pid>/maps
with two widthless scansets, "%s" into mode[16] and "%[^\n]"
into line[PATH_MAX]. Both assume the kernel caps maps records to
PATH_MAX; it does not.
show_map_vma() emits the path via seq_path() against the seq buffer,
which doubles on overflow (m->size <<= 1 in fs/seq_file.c), so a VMA
whose backing path is a deeply nested directory tree produces a single
maps record longer than PATH_MAX. scanf "%s" / "%[^\n]" without a
width writes until the field terminator regardless of destination size,
so a bpf_program__attach_usdt() consumer attaching against an
attacker-controlled PID overflows its own stack inside parse_vma_segs().
Bound both scansets to the declared buffer sizes ("%15s" for mode[16]
and "%4095[^\n]" for line[PATH_MAX]) and drain any residue past
line[4094] with "%*[^\n]" before the trailing "\n", matching the
libbpf-local fscanf style. Without the drain the residue of an
over-long record would stay in the stream and break the next "%zx-%zx"
parse, so the loop would exit early and any maps records after the
over-long entry would be silently skipped.
Also stop using sscanf(..., "%s") to peel the /proc/<pid>/root prefix
from lib_path. Build the exact prefix for the requested PID with
snprintf(), check it directly, and copy the remainder with
libbpf_strlcpy(). That removes a second unbounded stack write and
preserves paths containing spaces.
Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
v3:
- Correct Fixes tag to the initial USDT implementation commit,
per BPF CI review after adding second site.
v2:
- Replace the unbounded /proc/<pid>/root sscanf() path peeling with
snprintf() + prefix check + libbpf_strlcpy(), addressing review
feedback on v1 and preserving paths containing spaces.
- Keep the v1 maps parser fix using bounded fscanf() scansets and a
suppressed scanset drain for over-long records.
- Re-ran real parse_vma_segs() ASAN harnesses for the original maps
overflow, the proc-root overflow, proc-root paths with spaces, and
adjacent successful parses after an over-long maps record.
Reproduced with Debian 12 on rootless podman: an unprivileged
container process mkdirs 50 nested 200-char directories and mmaps
a file at the bottom, producing a 10403-byte /proc/<host-pid>/maps
line. A harness on the host then calls the real parse_vma_segs()
against the container's PID; libbpf is built with
-fsanitize=address and the only local source change is dropping
the "static" keyword on parse_vma_segs so the symbol is linkable
from the harness.
Stock libbpf reports:
==ERROR: AddressSanitizer: stack-buffer-overflow
WRITE of size 10349 at <line> thread T0
#0 scanf_common -> #1 __isoc99_fscanf
#3 parse_vma_segs tools/lib/bpf/usdt.c:509
Address ... in frame parse_vma_segs at offset 8512, just past
line[PATH_MAX].
Patched libbpf parses the same maps cleanly. Follow-up calls
return 0 with seg_cnt > 0 for libc.so.6 and for
ld-linux-x86-64.so.2 (format drain), which appears in maps
after the over-long entry.
On normal hardened builds the stack canary aborts the consumer;
on builds without stack protector the bytes past line[] are
attacker-influenced path bytes.
Selftest gate
=============
tools/testing/selftests/bpf/test_progs -t usdt under QEMU x86_64
(KVM) on the patched kernel: all 6 subtests pass (usdt/basic,
basic_optimized, optimized_attach, multispec, urand_auto_attach,
urand_pid_attach) on both stock and patched libbpf, diff-clean.
The in-tree selftest does not itself exercise long maps records.
tools/lib/bpf/usdt.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
index e3710933fd52a..2ed792cf11438 100644
--- a/tools/lib/bpf/usdt.c
+++ b/tools/lib/bpf/usdt.c
@@ -471,7 +471,7 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
char path[PATH_MAX], line[PATH_MAX], mode[16];
size_t seg_start, seg_end, seg_off;
struct elf_seg *seg;
- int tmp_pid, i, err;
+ int n, i, err;
FILE *f;
*seg_cnt = 0;
@@ -480,8 +480,13 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
* /proc/<pid>/root/<path>. They will be reported as just /<path> in
* /proc/<pid>/maps.
*/
- if (sscanf(lib_path, "/proc/%d/root%s", &tmp_pid, path) == 2 && pid == tmp_pid)
+ n = snprintf(line, sizeof(line), "/proc/%d/root", pid);
+ if (n < 0 || n >= (int)sizeof(line))
+ return -ENAMETOOLONG;
+ if (str_has_pfx(lib_path, line) && lib_path[n] == '/') {
+ libbpf_strlcpy(path, lib_path + n, sizeof(path));
goto proceed;
+ }
if (!realpath(lib_path, path)) {
pr_warn("usdt: failed to get absolute path of '%s' (err %s), using path as is...\n",
@@ -504,8 +509,11 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
* 7f5c6f5d1000-7f5c6f5d3000 rw-p 001c7000 08:04 21238613 /usr/lib64/libc-2.17.so
* 7f5c6f5d3000-7f5c6f5d8000 rw-p 00000000 00:00 0
* 7f5c6f5d8000-7f5c6f5d9000 r-xp 00000000 103:01 362990598 /data/users/andriin/linux/tools/bpf/usdt/libhello_usdt.so
+ *
+ * Bound the writes and drain residue: maps lines can exceed
+ * PATH_MAX when seq_path() uses a larger seq buffer.
*/
- while (fscanf(f, "%zx-%zx %s %zx %*s %*d%[^\n]\n",
+ while (fscanf(f, "%zx-%zx %15s %zx %*s %*d%4095[^\n]%*[^\n]\n",
&seg_start, &seg_end, mode, &seg_off, line) == 5) {
void *tmp;
--
2.53.0
On Fri, May 22, 2026 at 5:57 AM Michael Bommarito
<michael.bommarito@gmail.com> wrote:
>
> parse_vma_segs() in tools/lib/bpf/usdt.c parses /proc/<pid>/maps
> with two widthless scansets, "%s" into mode[16] and "%[^\n]"
> into line[PATH_MAX]. Both assume the kernel caps maps records to
> PATH_MAX; it does not.
>
> show_map_vma() emits the path via seq_path() against the seq buffer,
> which doubles on overflow (m->size <<= 1 in fs/seq_file.c), so a VMA
> whose backing path is a deeply nested directory tree produces a single
> maps record longer than PATH_MAX. scanf "%s" / "%[^\n]" without a
> width writes until the field terminator regardless of destination size,
> so a bpf_program__attach_usdt() consumer attaching against an
> attacker-controlled PID overflows its own stack inside parse_vma_segs().
>
> Bound both scansets to the declared buffer sizes ("%15s" for mode[16]
> and "%4095[^\n]" for line[PATH_MAX]) and drain any residue past
> line[4094] with "%*[^\n]" before the trailing "\n", matching the
> libbpf-local fscanf style. Without the drain the residue of an
> over-long record would stay in the stream and break the next "%zx-%zx"
> parse, so the loop would exit early and any maps records after the
> over-long entry would be silently skipped.
>
> Also stop using sscanf(..., "%s") to peel the /proc/<pid>/root prefix
> from lib_path. Build the exact prefix for the requested PID with
> snprintf(), check it directly, and copy the remainder with
> libbpf_strlcpy(). That removes a second unbounded stack write and
> preserves paths containing spaces.
>
> Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic")
> Cc: stable@vger.kernel.org
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
> ---
> v3:
> - Correct Fixes tag to the initial USDT implementation commit,
> per BPF CI review after adding second site.
>
> v2:
> - Replace the unbounded /proc/<pid>/root sscanf() path peeling with
> snprintf() + prefix check + libbpf_strlcpy(), addressing review
> feedback on v1 and preserving paths containing spaces.
> - Keep the v1 maps parser fix using bounded fscanf() scansets and a
> suppressed scanset drain for over-long records.
> - Re-ran real parse_vma_segs() ASAN harnesses for the original maps
> overflow, the proc-root overflow, proc-root paths with spaces, and
> adjacent successful parses after an over-long maps record.
>
> Reproduced with Debian 12 on rootless podman: an unprivileged
> container process mkdirs 50 nested 200-char directories and mmaps
> a file at the bottom, producing a 10403-byte /proc/<host-pid>/maps
> line. A harness on the host then calls the real parse_vma_segs()
> against the container's PID; libbpf is built with
> -fsanitize=address and the only local source change is dropping
> the "static" keyword on parse_vma_segs so the symbol is linkable
> from the harness.
>
> Stock libbpf reports:
>
> ==ERROR: AddressSanitizer: stack-buffer-overflow
> WRITE of size 10349 at <line> thread T0
> #0 scanf_common -> #1 __isoc99_fscanf
> #3 parse_vma_segs tools/lib/bpf/usdt.c:509
> Address ... in frame parse_vma_segs at offset 8512, just past
> line[PATH_MAX].
>
> Patched libbpf parses the same maps cleanly. Follow-up calls
> return 0 with seg_cnt > 0 for libc.so.6 and for
> ld-linux-x86-64.so.2 (format drain), which appears in maps
> after the over-long entry.
>
> On normal hardened builds the stack canary aborts the consumer;
> on builds without stack protector the bytes past line[] are
> attacker-influenced path bytes.
>
> Selftest gate
> =============
>
> tools/testing/selftests/bpf/test_progs -t usdt under QEMU x86_64
> (KVM) on the patched kernel: all 6 subtests pass (usdt/basic,
> basic_optimized, optimized_attach, multispec, urand_auto_attach,
> urand_pid_attach) on both stock and patched libbpf, diff-clean.
> The in-tree selftest does not itself exercise long maps records.
>
> tools/lib/bpf/usdt.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
> index e3710933fd52a..2ed792cf11438 100644
> --- a/tools/lib/bpf/usdt.c
> +++ b/tools/lib/bpf/usdt.c
> @@ -471,7 +471,7 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> char path[PATH_MAX], line[PATH_MAX], mode[16];
> size_t seg_start, seg_end, seg_off;
> struct elf_seg *seg;
> - int tmp_pid, i, err;
> + int n, i, err;
> FILE *f;
>
> *seg_cnt = 0;
> @@ -480,8 +480,13 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> * /proc/<pid>/root/<path>. They will be reported as just /<path> in
> * /proc/<pid>/maps.
> */
> - if (sscanf(lib_path, "/proc/%d/root%s", &tmp_pid, path) == 2 && pid == tmp_pid)
> + n = snprintf(line, sizeof(line), "/proc/%d/root", pid);
> + if (n < 0 || n >= (int)sizeof(line))
> + return -ENAMETOOLONG;
this can never happen, no matter the pid value
> + if (str_has_pfx(lib_path, line) && lib_path[n] == '/') {
seems like an overkill, I get the point of not using unbounded %s, but
we can do the same trick you are doing with %* below, so
if (sscanf(lib_path, "/proc/%d/root%n%*[^/n]", &n, &pid) == 1 && pid
== tmp_pid) {
libbpf_strlcpy(path, lib_path + n, sizeof(path));
goto proceed;
}
or something along those lines?
> + libbpf_strlcpy(path, lib_path + n, sizeof(path));
> goto proceed;
> + }
>
> if (!realpath(lib_path, path)) {
> pr_warn("usdt: failed to get absolute path of '%s' (err %s), using path as is...\n",
> @@ -504,8 +509,11 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> * 7f5c6f5d1000-7f5c6f5d3000 rw-p 001c7000 08:04 21238613 /usr/lib64/libc-2.17.so
> * 7f5c6f5d3000-7f5c6f5d8000 rw-p 00000000 00:00 0
> * 7f5c6f5d8000-7f5c6f5d9000 r-xp 00000000 103:01 362990598 /data/users/andriin/linux/tools/bpf/usdt/libhello_usdt.so
> + *
> + * Bound the writes and drain residue: maps lines can exceed
> + * PATH_MAX when seq_path() uses a larger seq buffer.
Claudespeak. Just say in human language that in some weird cases VMA
name/path can be longer than PATH_MAX, so we protect from that but
need make sure we consume the entire line with $* argument. Who cares
about seq_path() internals and its buffer usages?..
> */
> - while (fscanf(f, "%zx-%zx %s %zx %*s %*d%[^\n]\n",
> + while (fscanf(f, "%zx-%zx %15s %zx %*s %*d%4095[^\n]%*[^\n]\n",
> &seg_start, &seg_end, mode, &seg_off, line) == 5) {
given you hard-coded 4095 in sscanf format string, let's use 4096 in
line declaration instead of PATH_MAX, ok?
pw-bot: cr
> void *tmp;
>
> --
> 2.53.0
On Fri May 22, 2026 at 8:56 AM EDT, Michael Bommarito wrote:
> parse_vma_segs() in tools/lib/bpf/usdt.c parses /proc/<pid>/maps
> with two widthless scansets, "%s" into mode[16] and "%[^\n]"
> into line[PATH_MAX]. Both assume the kernel caps maps records to
> PATH_MAX; it does not.
>
> show_map_vma() emits the path via seq_path() against the seq buffer,
> which doubles on overflow (m->size <<= 1 in fs/seq_file.c), so a VMA
> whose backing path is a deeply nested directory tree produces a single
> maps record longer than PATH_MAX. scanf "%s" / "%[^\n]" without a
> width writes until the field terminator regardless of destination size,
> so a bpf_program__attach_usdt() consumer attaching against an
> attacker-controlled PID overflows its own stack inside parse_vma_segs().
>
> Bound both scansets to the declared buffer sizes ("%15s" for mode[16]
> and "%4095[^\n]" for line[PATH_MAX]) and drain any residue past
> line[4094] with "%*[^\n]" before the trailing "\n", matching the
> libbpf-local fscanf style. Without the drain the residue of an
> over-long record would stay in the stream and break the next "%zx-%zx"
> parse, so the loop would exit early and any maps records after the
> over-long entry would be silently skipped.
>
> Also stop using sscanf(..., "%s") to peel the /proc/<pid>/root prefix
> from lib_path. Build the exact prefix for the requested PID with
> snprintf(), check it directly, and copy the remainder with
> libbpf_strlcpy(). That removes a second unbounded stack write and
> preserves paths containing spaces.
>
> Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic")
> Cc: stable@vger.kernel.org
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> ---
> v3:
> - Correct Fixes tag to the initial USDT implementation commit,
> per BPF CI review after adding second site.
>
> v2:
> - Replace the unbounded /proc/<pid>/root sscanf() path peeling with
> snprintf() + prefix check + libbpf_strlcpy(), addressing review
> feedback on v1 and preserving paths containing spaces.
> - Keep the v1 maps parser fix using bounded fscanf() scansets and a
> suppressed scanset drain for over-long records.
> - Re-ran real parse_vma_segs() ASAN harnesses for the original maps
> overflow, the proc-root overflow, proc-root paths with spaces, and
> adjacent successful parses after an over-long maps record.
>
> Reproduced with Debian 12 on rootless podman: an unprivileged
> container process mkdirs 50 nested 200-char directories and mmaps
> a file at the bottom, producing a 10403-byte /proc/<host-pid>/maps
> line. A harness on the host then calls the real parse_vma_segs()
> against the container's PID; libbpf is built with
> -fsanitize=address and the only local source change is dropping
> the "static" keyword on parse_vma_segs so the symbol is linkable
> from the harness.
>
> Stock libbpf reports:
>
> ==ERROR: AddressSanitizer: stack-buffer-overflow
> WRITE of size 10349 at <line> thread T0
> #0 scanf_common -> #1 __isoc99_fscanf
> #3 parse_vma_segs tools/lib/bpf/usdt.c:509
> Address ... in frame parse_vma_segs at offset 8512, just past
> line[PATH_MAX].
>
> Patched libbpf parses the same maps cleanly. Follow-up calls
> return 0 with seg_cnt > 0 for libc.so.6 and for
> ld-linux-x86-64.so.2 (format drain), which appears in maps
> after the over-long entry.
>
> On normal hardened builds the stack canary aborts the consumer;
> on builds without stack protector the bytes past line[] are
> attacker-influenced path bytes.
>
> Selftest gate
> =============
>
> tools/testing/selftests/bpf/test_progs -t usdt under QEMU x86_64
> (KVM) on the patched kernel: all 6 subtests pass (usdt/basic,
> basic_optimized, optimized_attach, multispec, urand_auto_attach,
> urand_pid_attach) on both stock and patched libbpf, diff-clean.
> The in-tree selftest does not itself exercise long maps records.
>
> tools/lib/bpf/usdt.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
> index e3710933fd52a..2ed792cf11438 100644
> --- a/tools/lib/bpf/usdt.c
> +++ b/tools/lib/bpf/usdt.c
> @@ -471,7 +471,7 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> char path[PATH_MAX], line[PATH_MAX], mode[16];
> size_t seg_start, seg_end, seg_off;
> struct elf_seg *seg;
> - int tmp_pid, i, err;
> + int n, i, err;
> FILE *f;
>
> *seg_cnt = 0;
> @@ -480,8 +480,13 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> * /proc/<pid>/root/<path>. They will be reported as just /<path> in
> * /proc/<pid>/maps.
> */
> - if (sscanf(lib_path, "/proc/%d/root%s", &tmp_pid, path) == 2 && pid == tmp_pid)
> + n = snprintf(line, sizeof(line), "/proc/%d/root", pid);
Minor nit in case this gets respun: Can you mention that the n >= int(sizeof(line)) check also
takes care of making sure the libbpf_strlcpy below doesn't overflow path?
> + if (n < 0 || n >= (int)sizeof(line))
> + return -ENAMETOOLONG;
> + if (str_has_pfx(lib_path, line) && lib_path[n] == '/') {
> + libbpf_strlcpy(path, lib_path + n, sizeof(path));
> goto proceed;
> + }
>
> if (!realpath(lib_path, path)) {
> pr_warn("usdt: failed to get absolute path of '%s' (err %s), using path as is...\n",
> @@ -504,8 +509,11 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> * 7f5c6f5d1000-7f5c6f5d3000 rw-p 001c7000 08:04 21238613 /usr/lib64/libc-2.17.so
> * 7f5c6f5d3000-7f5c6f5d8000 rw-p 00000000 00:00 0
> * 7f5c6f5d8000-7f5c6f5d9000 r-xp 00000000 103:01 362990598 /data/users/andriin/linux/tools/bpf/usdt/libhello_usdt.so
> + *
> + * Bound the writes and drain residue: maps lines can exceed
> + * PATH_MAX when seq_path() uses a larger seq buffer.
> */
> - while (fscanf(f, "%zx-%zx %s %zx %*s %*d%[^\n]\n",
> + while (fscanf(f, "%zx-%zx %15s %zx %*s %*d%4095[^\n]%*[^\n]\n",
It would be nice if we could somehow not hardcode the lengths to make it clear,
where we're deriving the values from, but unless I'm missing something there
the only way would be to snprintf the format string every single time which would
be overkill.
> &seg_start, &seg_end, mode, &seg_off, line) == 5) {
> void *tmp;
>
On Fri, May 22, 2026 at 1:58 PM Emil Tsalapatis <emil@etsalapatis.com> wrote: > Minor nit in case this gets respun: Can you mention that the n >= int(sizeof(line)) check also > takes care of making sure the libbpf_strlcpy below doesn't overflow path? Sure, will make a note. > It would be nice if we could somehow not hardcode the lengths to make it clear, > where we're deriving the values from, but unless I'm missing something there > the only way would be to snprintf the format string every single time which would > be overkill. My first draft patch (Claude's, really) did that but it had an ugly inner loop and like you said, possibly introduced more ways to make a mistake. I think there is a real pedantic version that could be written more elegantly with helper macros/functions that could be shared across other sites, but maybe that belongs in a separate follow up. Thanks, Mike
© 2016 - 2026 Red Hat, Inc.