From: Sasha Levin
To: Andrew Morton, Masahiro Yamada, Luis Chamberlain, Linus Torvalds,
    Richard Weinberger, Juergen Gross, Geert Uytterhoeven, James Bottomley
Cc: Jonathan Corbet, Nathan Chancellor, Nicolas Schier, Petr Pavlu,
    Daniel Gomez, Greg KH, Petr Mladek, Steven Rostedt, Kees Cook,
    Peter Zijlstra, Thorsten Leemhuis, Vlastimil Babka, Helge Deller,
    Randy Dunlap, Laurent Pinchart, Vivian Wang, Zhen Lei, Sami Tolvanen,
    linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org,
    linux-modules@vger.kernel.org, linux-doc@vger.kernel.org, Sasha Levin
Subject: [PATCH v5 1/4] kallsyms: embed source file:line info in kernel stack traces
Date: Mon, 4 May 2026 11:33:57 -0400
Message-ID: <20260504153401.2416391-2-sashal@kernel.org>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260504153401.2416391-1-sashal@kernel.org>
References: <20260504153401.2416391-1-sashal@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Add CONFIG_KALLSYMS_LINEINFO, which embeds a compact address-to-line
lookup table in the kernel image so stack traces directly print source
file and line number information:

  root@localhost:~# echo c > /proc/sysrq-trigger
  [ 11.201987] sysrq: Trigger a crash
  [ 11.202831] Kernel panic - not syncing: sysrq triggered crash
  [ 11.206218] Call Trace:
  [ 11.206501]
  [ 11.206749] dump_stack_lvl+0x5d/0x80 (lib/dump_stack.c:94)
  [ 11.207403] vpanic+0x36e/0x620 (kernel/panic.c:650)
  [ 11.208565] ? __lock_acquire+0x465/0x2240 (kernel/locking/lockdep.c:4674)
  [ 11.209324] panic+0xc9/0xd0 (kernel/panic.c:787)
  [ 11.211873] ? find_held_lock+0x2b/0x80 (kernel/locking/lockdep.c:5350)
  [ 11.212597] ? lock_release+0xd3/0x300 (kernel/locking/lockdep.c:5535)
  [ 11.213312] sysrq_handle_crash+0x1a/0x20 (drivers/tty/sysrq.c:154)
  [ 11.214005] __handle_sysrq.cold+0x66/0x256 (drivers/tty/sysrq.c:611)
  [ 11.214712] write_sysrq_trigger+0x65/0x80 (drivers/tty/sysrq.c:1221)
  [ 11.215424] proc_reg_write+0x1bd/0x3c0 (fs/proc/inode.c:330)
  [ 11.216061] vfs_write+0x1c6/0xff0 (fs/read_write.c:686)
  [ 11.218848] ksys_write+0xfa/0x200 (fs/read_write.c:740)
  [ 11.222394] do_syscall_64+0xf3/0x690 (arch/x86/entry/syscall_64.c:63)
  [ 11.223942] entry_SYSCALL_64_after_hwframe+0x77/0x7f (arch/x86/entry/entry_64.S:121)

At build time, a new host tool (scripts/gen_lineinfo) reads DWARF
.debug_line from vmlinux using libdw (elfutils), extracts all
address-to-file:line mappings, and generates an assembly file with
sorted parallel arrays (offsets from _text, file IDs, and line
numbers). These are linked into vmlinux as .rodata.

At runtime, kallsyms_lookup_lineinfo() does a binary search on the
table and __sprint_symbol() appends "(file:line)" to each stack frame.
The lookup uses offsets from _text so it works with KASLR, requires no
locks or allocations, and is safe in any context including panic.

The feature requires CONFIG_DEBUG_INFO (for DWARF data) and elfutils
(libdw-dev) on the build host.

Memory footprint measured with a 1852-option x86_64 config:

  Table: 4,597,583 entries from 4,841 source files

  lineinfo_addrs[]          4,597,583 x u32 = 17.5 MiB
  lineinfo_file_ids[]       4,597,583 x u16 =  8.8 MiB
  lineinfo_lines[]          4,597,583 x u32 = 17.5 MiB
  file_offsets + filenames                  ~  0.1 MiB
  Total .rodata increase:                   ~ 44.0 MiB

  vmlinux (stripped): 529 MiB -> 573 MiB (+44 MiB / +8.3%)

The .config used for testing is a simple KVM guest configuration for
local development and testing.

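As a rough user-space illustration of the runtime lookup described
above (not the kernel code itself), the sketch below runs the same
"largest entry <= offset" binary search over three parallel arrays.
The table contents, the sample_* names and lookup_lineinfo_sketch()
are hypothetical and exist only for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample tables; the real ones are emitted by gen_lineinfo. */
static const uint32_t sample_addrs[]    = { 0x08, 0x10, 0x24, 0x40 };
static const uint16_t sample_file_ids[] = { 0, 0, 1, 1 };
static const uint32_t sample_lines[]    = { 94, 97, 650, 787 };
static const char *const sample_files[] = {
	"lib/dump_stack.c", "kernel/panic.c",
};

#define SAMPLE_ENTRIES (sizeof(sample_addrs) / sizeof(sample_addrs[0]))

/*
 * Resolve an offset (relative to a notional _text) to file:line.
 * Returns false when the offset precedes the first table entry.
 */
static bool lookup_lineinfo_sketch(uint32_t offset, const char **file,
				   uint32_t *line)
{
	size_t low = 0, high = SAMPLE_ENTRIES;

	assert(file && line);

	/* Find the first entry strictly greater than offset. */
	while (low < high) {
		size_t mid = low + (high - low) / 2;

		if (sample_addrs[mid] <= offset)
			low = mid + 1;
		else
			high = mid;
	}

	if (low == 0)
		return false;
	low--;	/* step back to the largest entry <= offset */

	*file = sample_files[sample_file_ids[low]];
	*line = sample_lines[low];
	return true;
}
```

With these sample tables, offset 0x30 lies between the entries at 0x24
and 0x40 and so resolves to kernel/panic.c:650, while offset 0x04
precedes the first entry and produces no annotation.
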
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Sasha Levin
---
 Documentation/admin-guide/index.rst           |   1 +
 .../admin-guide/kallsyms-lineinfo.rst         |  72 +++
 MAINTAINERS                                   |   6 +
 include/linux/kallsyms.h                      |  17 +-
 init/Kconfig                                  |  20 +
 kernel/kallsyms.c                             |  76 ++-
 kernel/kallsyms_internal.h                    |   9 +
 scripts/.gitignore                            |   1 +
 scripts/Makefile                              |   3 +
 scripts/empty_lineinfo.S                      |  30 +
 scripts/gen_lineinfo.c                        | 557 ++++++++++++++++++
 scripts/kallsyms.c                            |  16 +
 scripts/link-vmlinux.sh                       |  43 +-
 13 files changed, 841 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/admin-guide/kallsyms-lineinfo.rst
 create mode 100644 scripts/empty_lineinfo.S
 create mode 100644 scripts/gen_lineinfo.c

diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index cd28dfe91b060..37456e08fe43c 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -73,6 +73,7 @@ problems and bugs in particular.
    ramoops
    dynamic-debug-howto
    init
+   kallsyms-lineinfo
    kdump/index
    perf/index
    pstore-blk
diff --git a/Documentation/admin-guide/kallsyms-lineinfo.rst b/Documentation/admin-guide/kallsyms-lineinfo.rst
new file mode 100644
index 0000000000000..c8ec124394354
--- /dev/null
+++ b/Documentation/admin-guide/kallsyms-lineinfo.rst
@@ -0,0 +1,72 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+Kallsyms Source Line Info (LINEINFO)
+====================================
+
+Overview
+========
+
+``CONFIG_KALLSYMS_LINEINFO`` embeds DWARF-derived source file and line number
+mappings into the kernel image so that stack traces include
+``(file.c:123)`` annotations next to each symbol.  This makes it significantly
+easier to pinpoint the exact source location during debugging, without needing
+to manually cross-reference addresses with ``addr2line``.
+
+Enabling the Feature
+====================
+
+Enable the following kernel configuration options::
+
+    CONFIG_KALLSYMS=y
+    CONFIG_DEBUG_INFO=y
+    CONFIG_KALLSYMS_LINEINFO=y
+
+Build dependency: the host tool ``scripts/gen_lineinfo`` requires ``libdw``
+from elfutils.  Install the development package:
+
+- Debian/Ubuntu: ``apt install libdw-dev``
+- Fedora/RHEL: ``dnf install elfutils-devel``
+- Arch Linux: ``pacman -S elfutils``
+
+Example Output
+==============
+
+Without ``CONFIG_KALLSYMS_LINEINFO``::
+
+    Call Trace:
+
+     dump_stack_lvl+0x5d/0x80
+     do_syscall_64+0x82/0x190
+     entry_SYSCALL_64_after_hwframe+0x76/0x7e
+
+With ``CONFIG_KALLSYMS_LINEINFO``::
+
+    Call Trace:
+
+     dump_stack_lvl+0x5d/0x80 (lib/dump_stack.c:123)
+     do_syscall_64+0x82/0x190 (arch/x86/entry/common.c:52)
+     entry_SYSCALL_64_after_hwframe+0x76/0x7e
+
+Note that assembly routines (such as ``entry_SYSCALL_64_after_hwframe``) are
+not annotated because they lack DWARF debug information.
+
+Memory Overhead
+===============
+
+The lineinfo tables are stored in ``.rodata`` and typically add approximately
+44 MiB to the kernel image for a standard configuration (~4.6 million DWARF
+line entries, ~10 bytes per entry after deduplication).
+
+Known Limitations
+=================
+
+- **vmlinux only**: Only symbols in the core kernel image are annotated.
+  Module symbols are not covered.
+- **4 GiB offset limit**: Address offsets from ``_text`` are stored as 32-bit
+  values.  Entries beyond 4 GiB from ``_text`` are skipped at build time with
+  a warning.
+- **65535 file limit**: Source file IDs are stored as 16-bit values.  Builds
+  with more than 65535 unique source files will fail with an error.
+- **No assembly annotations**: Functions implemented in assembly that lack
+  DWARF ``.debug_line`` data are not annotated.
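
The "~10 bytes per entry" figure in the Memory Overhead section above
can be sanity-checked with a few lines of C. This is only a sketch:
the helper names are hypothetical, and the per-entry layout (u32
offset, u16 file ID, u32 line) is taken from the table in the commit
message rather than measured independently.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-entry cost: u32 offset + u16 file ID + u32 line number. */
enum { SKETCH_BYTES_PER_ENTRY = 4 + 2 + 4 };

/* Bytes taken by the three parallel arrays (filename strings excluded). */
static uint64_t lineinfo_table_bytes(uint64_t entries)
{
	/* Guard against overflow of the multiplication below. */
	assert(entries <= UINT64_MAX / SKETCH_BYTES_PER_ENTRY);
	return entries * SKETCH_BYTES_PER_ENTRY;
}

/* Convert a byte count to MiB for comparison with the quoted figures. */
static double bytes_to_mib(uint64_t bytes)
{
	return (double)bytes / (1024.0 * 1024.0);
}
```

For the 4,597,583 entries quoted in the commit message this gives
45,975,830 bytes, about 43.8 MiB, which together with roughly 0.1 MiB
of filename data is consistent with the reported ~44 MiB.
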
diff --git a/MAINTAINERS b/MAINTAINERS
index 2fb1c75afd163..a683a7023f6e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13793,6 +13793,12 @@ S:	Maintained
 F:	Documentation/hwmon/k8temp.rst
 F:	drivers/hwmon/k8temp.c
 
+KALLSYMS LINEINFO
+M:	Sasha Levin
+S:	Maintained
+F:	Documentation/admin-guide/kallsyms-lineinfo.rst
+F:	scripts/gen_lineinfo.c
+
 KASAN
 M:	Andrey Ryabinin
 R:	Alexander Potapenko
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index d5dd54c53ace6..7d4c9dca06c87 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -16,10 +16,15 @@
 #include
 
 #define KSYM_NAME_LEN 512
+
+/* Extra space for " (path/to/file.c:12345)" suffix when lineinfo is enabled */
+#define KSYM_LINEINFO_LEN (IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) ? 128 : 0)
+
 #define KSYM_SYMBOL_LEN (sizeof("%s+%#lx/%#lx [%s %s]") + \
 			(KSYM_NAME_LEN - 1) + \
 			2*(BITS_PER_LONG*3/10) + (MODULE_NAME_LEN - 1) + \
-			(BUILD_ID_SIZE_MAX * 2) + 1)
+			(BUILD_ID_SIZE_MAX * 2) + 1 + \
+			KSYM_LINEINFO_LEN)
 
 struct cred;
 struct module;
@@ -96,6 +101,9 @@ extern int sprint_backtrace_build_id(char *buffer, unsigned long address);
 
 int lookup_symbol_name(unsigned long addr, char *symname);
 
+bool kallsyms_lookup_lineinfo(unsigned long addr,
+			      const char **file, unsigned int *line);
+
 #else /* !CONFIG_KALLSYMS */
 
 static inline unsigned long kallsyms_lookup_name(const char *name)
@@ -164,6 +172,13 @@ static inline int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long)
 {
 	return -EOPNOTSUPP;
 }
+
+static inline bool kallsyms_lookup_lineinfo(unsigned long addr,
+					    const char **file,
+					    unsigned int *line)
+{
+	return false;
+}
 #endif /*CONFIG_KALLSYMS*/
 
 static inline void print_ip_sym(const char *loglvl, unsigned long ip)
diff --git a/init/Kconfig b/init/Kconfig
index 2937c4d308aec..99e78c253056a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2070,6 +2070,26 @@ config KALLSYMS_ALL
 
 	  Say N unless you really need all symbols, or kernel live patching.
 
+config KALLSYMS_LINEINFO
+	bool "Embed source file:line information in stack traces"
+	depends on KALLSYMS && DEBUG_INFO
+	help
+	  Embeds an address-to-source-line mapping table in the kernel
+	  image so that stack traces directly include file:line information,
+	  similar to what scripts/decode_stacktrace.sh provides but without
+	  needing external tools or a vmlinux with debug info at runtime.
+
+	  When enabled, stack traces will look like:
+
+	    kmem_cache_alloc_noprof+0x60/0x630 (mm/slub.c:3456)
+	    anon_vma_clone+0x2ed/0xcf0 (mm/rmap.c:412)
+
+	  This requires elfutils (libdw-dev/elfutils-devel) on the build host.
+	  Adds approximately 44MB to a typical kernel image (10 bytes per
+	  DWARF line-table entry, ~4.6M entries for a typical config).
+
+	  If unsure, say N.
+
 # end of the "standard kernel features (expert users)" menu
 
 config ARCH_HAS_MEMBARRIER_CALLBACKS
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index aec2f06858afd..1e3f527b13988 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -467,9 +467,56 @@ static int append_buildid(char *buffer, const char *modname,
 
 #endif /* CONFIG_STACKTRACE_BUILD_ID */
 
+bool kallsyms_lookup_lineinfo(unsigned long addr,
+			      const char **file, unsigned int *line)
+{
+	unsigned long long raw_offset;
+	unsigned int offset, low, high, mid, file_id;
+
+	if (!IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) || !lineinfo_num_entries)
+		return false;
+
+	/* Compute offset from _text */
+	if (addr < (unsigned long)_text)
+		return false;
+
+	raw_offset = addr - (unsigned long)_text;
+	if (raw_offset > UINT_MAX)
+		return false;
+	offset = (unsigned int)raw_offset;
+
+	/* Binary search for largest entry <= offset */
+	low = 0;
+	high = lineinfo_num_entries;
+	while (low < high) {
+		mid = low + (high - low) / 2;
+		if (lineinfo_addrs[mid] <= offset)
+			low = mid + 1;
+		else
+			high = mid;
+	}
+
+	if (low == 0)
+		return false;
+	low--;
+
+	file_id = lineinfo_file_ids[low];
+	*line = lineinfo_lines[low];
+
+	if (file_id >= lineinfo_num_files)
+		return false;
+
+	if (lineinfo_file_offsets[file_id] >= lineinfo_filenames_size)
+		return false;
+
+	*file = &lineinfo_filenames[lineinfo_file_offsets[file_id]];
+	return true;
+}
+
 /* Look up a kernel symbol and return it in a text buffer. */
 static int __sprint_symbol(char *buffer, unsigned long address,
-			   int symbol_offset, int add_offset, int add_buildid)
+			   int symbol_offset, int add_offset, int add_buildid,
+			   int add_lineinfo)
 {
 	char *modname;
 	const unsigned char *buildid;
@@ -497,6 +544,23 @@ static int __sprint_symbol(char *buffer, unsigned long address,
 		len += sprintf(buffer + len, "]");
 	}
 
+	/*
+	 * Append "(file:line)" only for stack-backtrace consumers.  Plain
+	 * sprint_symbol() backs %ps, and many existing format strings tack
+	 * literal "()" after %ps to indicate a function call ("foo()
+	 * replaced with bar()"); appending lineinfo there would produce a
+	 * confusing "foo (file:line)()".
+	 */
+	if (add_lineinfo && IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) && !modname) {
+		const char *li_file;
+		unsigned int li_line;
+
+		if (kallsyms_lookup_lineinfo(address,
+					     &li_file, &li_line))
+			len += snprintf(buffer + len, KSYM_SYMBOL_LEN - len,
+					" (%s:%u)", li_file, li_line);
+	}
+
 	return len;
 }
 
@@ -513,7 +577,7 @@ static int __sprint_symbol(char *buffer, unsigned long address,
  */
 int sprint_symbol(char *buffer, unsigned long address)
 {
-	return __sprint_symbol(buffer, address, 0, 1, 0);
+	return __sprint_symbol(buffer, address, 0, 1, 0, 0);
 }
 EXPORT_SYMBOL_GPL(sprint_symbol);
 
@@ -530,7 +594,7 @@ EXPORT_SYMBOL_GPL(sprint_symbol);
  */
 int sprint_symbol_build_id(char *buffer, unsigned long address)
 {
-	return __sprint_symbol(buffer, address, 0, 1, 1);
+	return __sprint_symbol(buffer, address, 0, 1, 1, 0);
 }
 EXPORT_SYMBOL_GPL(sprint_symbol_build_id);
 
@@ -547,7 +611,7 @@ EXPORT_SYMBOL_GPL(sprint_symbol_build_id);
  */
 int sprint_symbol_no_offset(char *buffer, unsigned long address)
 {
-	return __sprint_symbol(buffer, address, 0, 0, 0);
+	return __sprint_symbol(buffer, address, 0, 0, 0, 0);
 }
 EXPORT_SYMBOL_GPL(sprint_symbol_no_offset);
 
@@ -567,7 +631,7 @@ EXPORT_SYMBOL_GPL(sprint_symbol_no_offset);
  */
 int sprint_backtrace(char *buffer, unsigned long address)
 {
-	return __sprint_symbol(buffer, address, -1, 1, 0);
+	return __sprint_symbol(buffer, address, -1, 1, 0, 1);
 }
 
 /**
@@ -587,7 +651,7 @@ int sprint_backtrace(char *buffer, unsigned long address)
  */
 int sprint_backtrace_build_id(char *buffer, unsigned long address)
 {
-	return __sprint_symbol(buffer, address, -1, 1, 1);
+	return __sprint_symbol(buffer, address, -1, 1, 1, 1);
 }
 
 /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
diff --git a/kernel/kallsyms_internal.h b/kernel/kallsyms_internal.h
index 81a867dbe57d4..d7374ce444d81 100644
--- a/kernel/kallsyms_internal.h
+++ b/kernel/kallsyms_internal.h
@@ -15,4 +15,13 @@ extern const u16 kallsyms_token_index[];
 extern const unsigned int kallsyms_markers[];
 extern const u8 kallsyms_seqs_of_names[];
 
+extern const u32 lineinfo_num_entries;
+extern const u32 lineinfo_addrs[];
+extern const u16 lineinfo_file_ids[];
+extern const u32 lineinfo_lines[];
+extern const u32 lineinfo_num_files;
+extern const u32 lineinfo_file_offsets[];
+extern const u32 lineinfo_filenames_size;
+extern const char lineinfo_filenames[];
+
 #endif // LINUX_KALLSYMS_INTERNAL_H_
diff --git a/scripts/.gitignore b/scripts/.gitignore
index 4215c2208f7e4..e175714c18b61 100644
--- a/scripts/.gitignore
+++ b/scripts/.gitignore
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 /asn1_compiler
+/gen_lineinfo
 /gen_packed_field_checks
 /generate_rust_target
 /insert-sys-cert
diff --git a/scripts/Makefile b/scripts/Makefile
index 3434a82a119f0..55244ce955781 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -4,6 +4,7 @@
 # the kernel for the build process.
 
 hostprogs-always-$(CONFIG_KALLSYMS)			+= kallsyms
+hostprogs-always-$(CONFIG_KALLSYMS_LINEINFO)		+= gen_lineinfo
 hostprogs-always-$(BUILD_C_RECORDMCOUNT)		+= recordmcount
 hostprogs-always-$(CONFIG_BUILDTIME_TABLE_SORT)		+= sorttable
 hostprogs-always-$(CONFIG_ASN1)				+= asn1_compiler
@@ -37,6 +38,8 @@ HOSTCFLAGS_asn1_compiler.o = -I$(srctree)/include
 HOSTCFLAGS_sign-file.o = $(shell $(HOSTPKG_CONFIG) --cflags libcrypto 2> /dev/null)
 HOSTCFLAGS_sign-file.o += -I$(srctree)/tools/include/uapi/
 HOSTLDLIBS_sign-file = $(shell $(HOSTPKG_CONFIG) --libs libcrypto 2> /dev/null || echo -lcrypto)
+HOSTCFLAGS_gen_lineinfo.o = $(shell $(HOSTPKG_CONFIG) --cflags libdw 2> /dev/null)
+HOSTLDLIBS_gen_lineinfo = $(shell $(HOSTPKG_CONFIG) --libs libdw 2> /dev/null || echo -ldw -lelf -lz)
 
 ifdef CONFIG_UNWINDER_ORC
 ifeq ($(ARCH),x86_64)
diff --git a/scripts/empty_lineinfo.S b/scripts/empty_lineinfo.S
new file mode 100644
index 0000000000000..e058c41137123
--- /dev/null
+++ b/scripts/empty_lineinfo.S
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2026 Sasha Levin
+ *
+ * Empty lineinfo stub for the initial vmlinux link.
+ * The real lineinfo is generated from .tmp_vmlinux1 by gen_lineinfo.
+ */
+	.section .rodata, "a"
+	.globl lineinfo_num_entries
+	.balign 4
+lineinfo_num_entries:
+	.long 0
+	.globl lineinfo_num_files
+	.balign 4
+lineinfo_num_files:
+	.long 0
+	.globl lineinfo_addrs
+lineinfo_addrs:
+	.globl lineinfo_file_ids
+lineinfo_file_ids:
+	.globl lineinfo_lines
+lineinfo_lines:
+	.globl lineinfo_file_offsets
+lineinfo_file_offsets:
+	.globl lineinfo_filenames_size
+	.balign 4
+lineinfo_filenames_size:
+	.long 0
+	.globl lineinfo_filenames
+lineinfo_filenames:
diff --git a/scripts/gen_lineinfo.c b/scripts/gen_lineinfo.c
new file mode 100644
index 0000000000000..699e760178f09
--- /dev/null
+++ b/scripts/gen_lineinfo.c
@@ -0,0 +1,557 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * gen_lineinfo.c - Generate address-to-source-line lookup tables from DWARF
+ *
+ * Copyright (C) 2026 Sasha Levin
+ *
+ * Reads DWARF .debug_line from a vmlinux ELF file and outputs an assembly
+ * file containing sorted lookup tables that the kernel uses to annotate
+ * stack traces with source file:line information.
+ *
+ * Requires libdw from elfutils.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+static unsigned int skipped_overflow;
+
+/*
+ * vmlinux mode: end of the invariant .text region.  Zero means "no cap"
+ * (graceful fallback when _etext is absent on some build).
+ */
+static unsigned long long text_end_addr;
+
+struct line_entry {
+	unsigned int offset;	/* offset from _text */
+	unsigned int file_id;
+	unsigned int line;
+};
+
+struct file_entry {
+	char *name;
+	unsigned int id;
+	unsigned int str_offset;
+};
+
+static struct line_entry *entries;
+static unsigned int num_entries;
+static unsigned int entries_capacity;
+
+static struct file_entry *files;
+static unsigned int num_files;
+static unsigned int files_capacity;
+
+#define FILE_HASH_BITS 13
+#define FILE_HASH_SIZE (1 << FILE_HASH_BITS)
+
+struct file_hash_entry {
+	const char *name;
+	unsigned int id;
+};
+
+static struct file_hash_entry file_hash[FILE_HASH_SIZE];
+
+static unsigned int hash_str(const char *s)
+{
+	unsigned int h = 5381;
+
+	for (; *s; s++)
+		h = h * 33 + (unsigned char)*s;
+	return h & (FILE_HASH_SIZE - 1);
+}
+
+static void add_entry(unsigned int offset, unsigned int file_id,
+		      unsigned int line)
+{
+	if (num_entries >= entries_capacity) {
+		entries_capacity = entries_capacity ? entries_capacity * 2 : 65536;
+		entries = realloc(entries, entries_capacity * sizeof(*entries));
+		if (!entries) {
+			fprintf(stderr, "out of memory\n");
+			exit(1);
+		}
+	}
+	entries[num_entries].offset = offset;
+	entries[num_entries].file_id = file_id;
+	entries[num_entries].line = line;
+	num_entries++;
+}
+
+static unsigned int find_or_add_file(const char *name)
+{
+	unsigned int h = hash_str(name);
+
+	/* Open-addressing lookup with linear probing */
+	while (file_hash[h].name) {
+		if (!strcmp(file_hash[h].name, name))
+			return file_hash[h].id;
+		h = (h + 1) & (FILE_HASH_SIZE - 1);
+	}
+
+	if (num_files >= 65535) {
+		fprintf(stderr,
+			"gen_lineinfo: too many source files (%u > 65535)\n",
+			num_files);
+		exit(1);
+	}
+
+	if (num_files >= files_capacity) {
+		files_capacity = files_capacity ? files_capacity * 2 : 4096;
+		files = realloc(files, files_capacity * sizeof(*files));
+		if (!files) {
+			fprintf(stderr, "out of memory\n");
+			exit(1);
+		}
+	}
+	files[num_files].name = strdup(name);
+	files[num_files].id = num_files;
+
+	/* Insert into hash table (points to files[] entry) */
+	file_hash[h].name = files[num_files].name;
+	file_hash[h].id = num_files;
+
+	num_files++;
+	return num_files - 1;
+}
+
+/*
+ * Well-known top-level directories in the kernel source tree.
+ * Used as a fallback to recover relative paths from absolute DWARF paths
+ * when comp_dir doesn't match (e.g. O= out-of-tree builds where comp_dir
+ * is the build directory but source paths point into the source tree).
+ */
+static const char * const kernel_dirs[] = {
+	"arch/", "block/", "certs/", "crypto/", "drivers/", "fs/",
+	"include/", "init/", "io_uring/", "ipc/", "kernel/", "lib/",
+	"mm/", "net/", "rust/", "samples/", "scripts/", "security/",
+	"sound/", "tools/", "usr/", "virt/",
+};
+
+/*
+ * Strip a filename to a kernel-relative path.
+ *
+ * For absolute paths, strip the comp_dir prefix (from DWARF) to get
+ * a kernel-tree-relative path.  When that fails (e.g. O= builds where
+ * comp_dir is the build directory), scan for a well-known kernel
+ * top-level directory name in the path to recover the relative path.
+ * Fall back to the basename as a last resort.
+ *
+ * For relative paths (common in modules), libdw may produce a bogus
+ * doubled path like "net/foo/bar.c/net/foo/bar.c" due to ET_REL DWARF
+ * quirks.  Detect and strip such duplicates.
+ */
+static const char *make_relative(const char *path, const char *comp_dir)
+{
+	const char *p;
+
+	/* If already relative, use as-is */
+	if (path[0] != '/')
+		return path;
+
+	/* comp_dir from DWARF is the most reliable method */
+	if (comp_dir) {
+		size_t len = strlen(comp_dir);
+
+		if (!strncmp(path, comp_dir, len) && path[len] == '/') {
+			const char *rel = path + len + 1;
+
+			/*
+			 * If comp_dir pointed to a subdirectory
+			 * (e.g. arch/parisc/kernel) rather than
+			 * the tree root, stripping it leaves a
+			 * bare filename.  Fall through to the
+			 * kernel_dirs scan so we recover the full
+			 * relative path instead.
+			 */
+			if (strchr(rel, '/'))
+				return rel;
+		}
+
+		/*
+		 * comp_dir prefix didn't help -- either it didn't match
+		 * or it was too specific and left a bare filename.
+		 * Scan for a known kernel top-level directory component
+		 * to find where the relative path starts.  This handles
+		 * O= builds and arches where comp_dir is a subdirectory.
+		 */
+		for (p = path + 1; *p; p++) {
+			if (*(p - 1) == '/') {
+				for (unsigned int i = 0; i < sizeof(kernel_dirs) /
+				     sizeof(kernel_dirs[0]); i++) {
+					if (!strncmp(p, kernel_dirs[i],
+						     strlen(kernel_dirs[i])))
+						return p;
+				}
+			}
+		}
+
+		/* Fall back to basename */
+		p = strrchr(path, '/');
+		return p ? p + 1 : path;
+	}
+
+	/* Fall back to basename */
+	p = strrchr(path, '/');
+	return p ? p + 1 : path;
+}
+
+static int compare_entries(const void *a, const void *b)
+{
+	const struct line_entry *ea = a;
+	const struct line_entry *eb = b;
+
+	if (ea->offset != eb->offset)
+		return ea->offset < eb->offset ? -1 : 1;
+	if (ea->file_id != eb->file_id)
+		return ea->file_id < eb->file_id ? -1 : 1;
+	if (ea->line != eb->line)
+		return ea->line < eb->line ? -1 : 1;
+	return 0;
+}
+
+/*
+ * Look up a vmlinux symbol by exact name and return its st_value, or
+ * @fallback if absent.  Aborts when @required and the symbol is missing.
+ */
+static unsigned long long find_vmlinux_sym(Elf *elf, const char *name,
+					   unsigned long long fallback,
+					   bool required)
+{
+	size_t nsyms, i;
+	Elf_Scn *scn = NULL;
+	GElf_Shdr shdr;
+
+	while ((scn = elf_nextscn(elf, scn)) != NULL) {
+		Elf_Data *data;
+
+		if (!gelf_getshdr(scn, &shdr))
+			continue;
+		if (shdr.sh_type != SHT_SYMTAB)
+			continue;
+
+		data = elf_getdata(scn, NULL);
+		if (!data)
+			continue;
+
+		nsyms = shdr.sh_size / shdr.sh_entsize;
+		for (i = 0; i < nsyms; i++) {
+			GElf_Sym sym;
+			const char *sname;
+
+			if (!gelf_getsym(data, i, &sym))
+				continue;
+			sname = elf_strptr(elf, shdr.sh_link, sym.st_name);
+			if (sname && !strcmp(sname, name))
+				return sym.st_value;
+		}
+	}
+
+	if (required) {
+		fprintf(stderr, "Cannot find %s symbol\n", name);
+		exit(1);
+	}
+	return fallback;
+}
+
+static unsigned long long find_text_addr(Elf *elf)
+{
+	return find_vmlinux_sym(elf, "_text", 0, true);
+}
+
+/*
+ * vmlinux is linked in multiple passes: gen_lineinfo runs against
+ * .tmp_vmlinux1 (which carries an empty lineinfo stub), then real tables
+ * are linked in for the final image.  Sections placed AFTER .rodata
+ * (.init.text, .exit.text, ...) shift forward as .rodata grows to hold
+ * the real lineinfo blob, so DWARF addresses we'd capture for them in
+ * pass 1 would be stale in the final kernel.  Cap captured addresses at
+ * _etext, the symbol that marks the end of .text -- placed before .rodata
+ * in every architecture's vmlinux.lds.S, so its addresses are invariant
+ * across the relink.  Returns 0 if _etext is absent (no cap; v3 behavior).
+ */
+static unsigned long long find_text_end_addr(Elf *elf)
+{
+	return find_vmlinux_sym(elf, "_etext", 0, false);
+}
+
+static void process_dwarf(Dwarf *dwarf, unsigned long long text_addr)
+{
+	Dwarf_Off off = 0, next_off;
+	size_t hdr_size;
+
+	while (dwarf_nextcu(dwarf, off, &next_off, &hdr_size,
+			    NULL, NULL, NULL) == 0) {
+		Dwarf_Die cudie;
+		Dwarf_Lines *lines;
+		size_t nlines;
+		Dwarf_Attribute attr;
+		const char *comp_dir = NULL;
+
+		if (!dwarf_offdie(dwarf, off + hdr_size, &cudie))
+			goto next;
+
+		if (dwarf_attr(&cudie, DW_AT_comp_dir, &attr))
+			comp_dir = dwarf_formstring(&attr);
+
+		if (dwarf_getsrclines(&cudie, &lines, &nlines) != 0)
+			goto next;
+
+		for (size_t i = 0; i < nlines; i++) {
+			Dwarf_Line *line = dwarf_onesrcline(lines, i);
+			Dwarf_Addr addr;
+			const char *src;
+			const char *rel;
+			unsigned int file_id, loffset;
+			int lineno;
+
+			if (!line)
+				continue;
+
+			if (dwarf_lineaddr(line, &addr) != 0)
+				continue;
+			if (dwarf_lineno(line, &lineno) != 0)
+				continue;
+			if (lineno == 0)
+				continue;
+
+			src = dwarf_linesrc(line, NULL, NULL);
+			if (!src)
+				continue;
+
+			if (addr < text_addr)
+				continue;
+			/*
+			 * Skip addresses past _etext.  Sections after .rodata
+			 * shift when the real lineinfo replaces the empty stub
+			 * during the multi-pass vmlinux link, so any address
+			 * we'd capture there would be stale by the time the
+			 * final kernel runs.
+			 */
+			if (text_end_addr && addr >= text_end_addr)
+				continue;
+
+			{
+				unsigned long long raw_offset = addr - text_addr;
+
+				if (raw_offset > UINT_MAX) {
+					skipped_overflow++;
+					continue;
+				}
+				loffset = (unsigned int)raw_offset;
+			}
+
+			rel = make_relative(src, comp_dir);
+			file_id = find_or_add_file(rel);
+
+			add_entry(loffset, file_id, (unsigned int)lineno);
+		}
+next:
+		off = next_off;
+	}
+}
+
+static void deduplicate(void)
+{
+	unsigned int i, j;
+
+	if (num_entries < 2)
+		return;
+
+	/* Sort by offset, then file_id, then line for stability */
+	qsort(entries, num_entries, sizeof(*entries), compare_entries);
+
+	/*
+	 * Remove duplicate entries:
+	 * - Same offset: keep first (deterministic from stable sort keys)
+	 * - Same file:line as previous kept entry: redundant for binary
+	 *   search -- any address between them resolves to the earlier one
+	 */
+	j = 0;
+	for (i = 1; i < num_entries; i++) {
+		if (entries[i].offset == entries[j].offset)
+			continue;
+		if (entries[i].file_id == entries[j].file_id &&
+		    entries[i].line == entries[j].line)
+			continue;
+		j++;
+		if (j != i)
+			entries[j] = entries[i];
+	}
+	num_entries = j + 1;
+}
+
+static void compute_file_offsets(void)
+{
+	unsigned int offset = 0;
+
+	for (unsigned int i = 0; i < num_files; i++) {
+		files[i].str_offset = offset;
+		offset += strlen(files[i].name) + 1;
+	}
+}
+
+static void print_escaped_asciz(const char *s)
+{
+	printf("\t.asciz \"");
+	for (; *s; s++) {
+		if (*s == '"' || *s == '\\')
+			putchar('\\');
+		putchar(*s);
+	}
+	printf("\"\n");
+}
+
+static void output_assembly(void)
+{
+	printf("/* SPDX-License-Identifier: GPL-2.0 */\n");
+	printf("/*\n");
+	printf(" * Automatically generated by scripts/gen_lineinfo\n");
+	printf(" * Do not edit.\n");
+	printf(" */\n\n");
+
+	printf("\t.section .rodata, \"a\"\n\n");
+
+	/* Number of entries */
+	printf("\t.globl lineinfo_num_entries\n");
+	printf("\t.balign 4\n");
+	printf("lineinfo_num_entries:\n");
+	printf("\t.long %u\n\n", num_entries);
+
+	/* Number of files */
+	printf("\t.globl lineinfo_num_files\n");
+	printf("\t.balign 4\n");
+	printf("lineinfo_num_files:\n");
+	printf("\t.long %u\n\n", num_files);
+
+	/* Sorted address offsets from _text */
+	printf("\t.globl lineinfo_addrs\n");
+	printf("\t.balign 4\n");
+	printf("lineinfo_addrs:\n");
+	for (unsigned int i = 0; i < num_entries; i++)
+		printf("\t.long 0x%x\n", entries[i].offset);
+	printf("\n");
+
+	/* File IDs, parallel to addrs (u16 -- supports up to 65535 files) */
+	printf("\t.globl lineinfo_file_ids\n");
+	printf("\t.balign 2\n");
+	printf("lineinfo_file_ids:\n");
+	for (unsigned int i = 0; i < num_entries; i++)
+		printf("\t.short %u\n", entries[i].file_id);
+	printf("\n");
+
+	/* Line numbers, parallel to addrs */
+	printf("\t.globl lineinfo_lines\n");
+	printf("\t.balign 4\n");
+	printf("lineinfo_lines:\n");
+	for (unsigned int i = 0; i < num_entries; i++)
+		printf("\t.long %u\n", entries[i].line);
+	printf("\n");
+
+	/* File string offset table */
+	printf("\t.globl lineinfo_file_offsets\n");
+	printf("\t.balign 4\n");
+	printf("lineinfo_file_offsets:\n");
+	for (unsigned int i = 0; i < num_files; i++)
+		printf("\t.long %u\n", files[i].str_offset);
+	printf("\n");
+
+	/* Filenames size */
+	{
+		unsigned int fsize = 0;
+
+		for (unsigned int i = 0; i < num_files; i++)
+			fsize += strlen(files[i].name) + 1;
+		printf("\t.globl lineinfo_filenames_size\n");
+		printf("\t.balign 4\n");
+		printf("lineinfo_filenames_size:\n");
+		printf("\t.long %u\n\n", fsize);
+	}
+
+	/* Concatenated NUL-terminated filenames */
+	printf("\t.globl lineinfo_filenames\n");
+	printf("lineinfo_filenames:\n");
+	for (unsigned int i = 0; i < num_files; i++)
+		print_escaped_asciz(files[i].name);
+	printf("\n");
+}
+
+int main(int argc, char *argv[])
+{
+	int fd;
+	Elf *elf;
+	Dwarf *dwarf;
+	unsigned long long text_addr;
+
+	if (argc != 2) {
+		fprintf(stderr, "Usage: %s \n", argv[0]);
+		return 1;
+	}
+
+	fd = open(argv[1], O_RDONLY);
+	if (fd < 0) {
+		fprintf(stderr, "Cannot open %s: %s\n", argv[1],
+			strerror(errno));
+		return 1;
+	}
+
+	elf_version(EV_CURRENT);
+	elf = elf_begin(fd, ELF_C_READ, NULL);
+	if (!elf) {
+		fprintf(stderr, "elf_begin failed: %s\n",
+			elf_errmsg(elf_errno()));
+		close(fd);
+		return 1;
+	}
+
+	text_addr = find_text_addr(elf);
+	text_end_addr = find_text_end_addr(elf);
+
+	dwarf = dwarf_begin_elf(elf, DWARF_C_READ, NULL);
+	if (!dwarf) {
+		fprintf(stderr, "dwarf_begin_elf failed: %s\n",
+			dwarf_errmsg(dwarf_errno()));
+		fprintf(stderr, "Is %s built with CONFIG_DEBUG_INFO?\n",
+			argv[1]);
+		elf_end(elf);
+		close(fd);
+		return 1;
+	}
+
+	process_dwarf(dwarf, text_addr);
+
+	if (skipped_overflow)
+		fprintf(stderr,
+			"lineinfo: warning: %u entries skipped (offset > 4 GiB from _text)\n",
+			skipped_overflow);
+
+	deduplicate();
+	compute_file_offsets();
+
+	fprintf(stderr, "lineinfo: %u entries, %u files\n",
+		num_entries, num_files);
+
+	output_assembly();
+
+	dwarf_end(dwarf);
+	elf_end(elf);
+	close(fd);
+
+	/* Cleanup */
+	free(entries);
+	for (unsigned int i = 0; i < num_files; i++)
+		free(files[i].name);
+	free(files);
+
+	return 0;
+}
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 37d5c095ad22a..42662c4fbc6c9 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -78,6 +78,17 @@ static char *sym_name(const struct sym_entry *s)
 
 static bool is_ignored_symbol(const char *name, char type)
 {
+	/* Ignore lineinfo symbols for kallsyms pass stability */
+	static const char * const lineinfo_syms[] = {
+		"lineinfo_addrs",
+		"lineinfo_file_ids",
+		"lineinfo_file_offsets",
+		"lineinfo_filenames",
+		"lineinfo_lines",
+		"lineinfo_num_entries",
+		"lineinfo_num_files",
+	};
+
 	if (type == 'u' || type == 'n')
 		return true;
 
@@ -90,6 +101,11 @@ static bool is_ignored_symbol(const char *name, char type)
 			return true;
 	}
 
+	for (size_t i = 0; i < ARRAY_SIZE(lineinfo_syms); i++) {
+		if (!strcmp(name, lineinfo_syms[i]))
+			return true;
+	}
+
 	return false;
 }
 
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index f99e196abeea4..39ca44fbb259b 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -103,7 +103,7 @@ vmlinux_link()
 	${ld} ${ldflags} -o ${output}					\
 		${wl}--whole-archive ${objs} ${wl}--no-whole-archive	\
 		${wl}--start-group ${libs} ${wl}--end-group		\
-		${kallsymso} ${btf_vmlinux_bin_o} ${arch_vmlinux_o} ${ldlibs}
+		${kallsymso} ${lineinfo_o} ${btf_vmlinux_bin_o} ${arch_vmlinux_o} ${ldlibs}
 }
 
 # Create ${2}.o file with all symbols from the ${1} object file
@@ -129,6 +129,26 @@ kallsyms()
 	kallsymso=${2}.o
 }
 
+# Generate lineinfo tables from DWARF debug info in a temporary vmlinux.
+# ${1} - temporary vmlinux with debug info
+# Output: sets lineinfo_o to the generated .o file
+gen_lineinfo()
+{
+	info LINEINFO .tmp_lineinfo.S
+	if ! scripts/gen_lineinfo "${1}" > .tmp_lineinfo.S; then
+		echo >&2 "Failed to generate lineinfo from ${1}"
+		echo >&2 "Try to disable CONFIG_KALLSYMS_LINEINFO"
+		exit 1
+	fi
+
+	info AS .tmp_lineinfo.o
+	${CC} ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS} \
+	      ${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \
+	      -c -o .tmp_lineinfo.o .tmp_lineinfo.S
+
+	lineinfo_o=.tmp_lineinfo.o
+}
+
 # Perform kallsyms for the given temporary vmlinux.
 sysmap_and_kallsyms()
 {
@@ -155,6 +175,7 @@ sorttable()
 cleanup()
 {
 	rm -f .btf.*
+	rm -f .tmp_lineinfo.*
 	rm -f .tmp_vmlinux.nm-sort
 	rm -f System.map
 	rm -f vmlinux
@@ -183,6 +204,7 @@ fi
 btf_vmlinux_bin_o=
 btfids_vmlinux=
 kallsymso=
+lineinfo_o=
 strip_debug=
 generate_map=
 
@@ -198,10 +220,21 @@ if is_enabled CONFIG_KALLSYMS; then
 	kallsyms .tmp_vmlinux0.syms .tmp_vmlinux0.kallsyms
 fi
 
+if is_enabled CONFIG_KALLSYMS_LINEINFO; then
+	# Assemble an empty lineinfo stub for the initial link.
+	# The real lineinfo is generated from .tmp_vmlinux1 by gen_lineinfo.
+ ${CC} ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS} \ + ${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \ + -c -o .tmp_lineinfo.o "${srctree}/scripts/empty_lineinfo.S" + lineinfo_o=3D.tmp_lineinfo.o +fi + if is_enabled CONFIG_KALLSYMS || is_enabled CONFIG_DEBUG_INFO_BTF; then =20 - # The kallsyms linking does not need debug symbols, but the BTF does. - if ! is_enabled CONFIG_DEBUG_INFO_BTF; then + # The kallsyms linking does not need debug symbols, but BTF and + # lineinfo generation do. + if ! is_enabled CONFIG_DEBUG_INFO_BTF && + ! is_enabled CONFIG_KALLSYMS_LINEINFO; then strip_debug=3D1 fi =20 @@ -219,6 +252,10 @@ if is_enabled CONFIG_DEBUG_INFO_BTF; then btfids_vmlinux=3D.tmp_vmlinux1.BTF_ids fi =20 +if is_enabled CONFIG_KALLSYMS_LINEINFO; then + gen_lineinfo .tmp_vmlinux1 +fi + if is_enabled CONFIG_KALLSYMS; then =20 # kallsyms support --=20 2.53.0 From nobody Wed Jun 10 11:10:56 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 50C8F3E0245; Mon, 4 May 2026 15:34:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777908895; cv=none; b=LF2/e5nSU7Y+/aezqicrFO5S1Ic1KsuwDU9N8yUqgI/Yy7oCmfjQMmArZCXwE7AyWSHnwEcTfh7ADXULliFaxCOcLPyCmcacYWKp9Wthz1eGL6C6drVGPEMZomDujGhOAzvaZw97OSY3S6djhrAkkllgfCl+oZ3/+LVUbtBMlAo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777908895; c=relaxed/simple; bh=AMTIyJWX16w7IJUGxsbF61qGVVwm/vyGXeuoJAlmpdY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=LTz/DvaIUV8JIGOEisSjzYhvvDo7OCStqX+Ojv647zJKVLfhFUkdhFP7ugWWvxA4JrobJRZ2kzf6jZ5bZ6sdQJklBB0s0RGIs2Dnl8Y/2zmuLKVxajFvyN5TSMvrl7zDO6t15HrdiGIbJNNZKQaPYG2y1+R2H1yCKVp+YXAQ2II= 
From: Sasha Levin
To: Andrew Morton, Masahiro Yamada, Luis Chamberlain, Linus Torvalds, Richard Weinberger, Juergen Gross, Geert Uytterhoeven, James Bottomley
Cc: Jonathan Corbet, Nathan Chancellor, Nicolas Schier, Petr Pavlu, Daniel Gomez, Greg KH, Petr Mladek, Steven Rostedt, Kees Cook, Peter Zijlstra, Thorsten Leemhuis, Vlastimil Babka, Helge Deller, Randy Dunlap, Laurent Pinchart, Vivian Wang, Zhen Lei, Sami Tolvanen, linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org, linux-modules@vger.kernel.org, linux-doc@vger.kernel.org, Sasha Levin
Subject: [PATCH v5 2/4] kallsyms: extend lineinfo to loadable modules
Date: Mon, 4 May 2026 11:33:58 -0400
Message-ID: <20260504153401.2416391-3-sashal@kernel.org>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260504153401.2416391-1-sashal@kernel.org>
References: <20260504153401.2416391-1-sashal@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Add CONFIG_KALLSYMS_LINEINFO_MODULES, which extends the
CONFIG_KALLSYMS_LINEINFO feature to loadable kernel modules.

At build time, each .ko is post-processed by
scripts/gen-mod-lineinfo.sh (modeled on gen-btf.sh), which runs
scripts/gen_lineinfo --module on the .ko, generates per-section
.mod_lineinfo and .init.mod_lineinfo sections containing compact
binary tables of section-relative offsets, file IDs, line numbers, and
filenames, and embeds them back into the .ko via a partial link
(ld -r).

At runtime, module_lookup_lineinfo() walks the section descriptors in
each blob, finds the one whose runtime range contains the queried
address, and binary-searches that section's table. The lookup is
NMI/panic-safe (no locks, no allocations): the data lives in read-only
module memory and is freed automatically when the module is unloaded
(or, for the init blob, when the module's init memory is released).

The gen_lineinfo tool gains a --module mode, which:

 - Walks an allowlist of text-like sections (.text, .exit.text,
   .init.text), gating each on its presence in the .ko.

 - Uses an ELF relocation against each covered section's symbol as the
   runtime "anchor", resolved by the module loader's standard
   apply_relocations() pass, with no implicit base derivation from
   mod->mem[].base and no special-cased loader logic.

 - Disambiguates DWARF addresses across sections that all share
   sh_addr == 0 in ET_REL files via per-section synthetic biases
   applied inside apply_debug_line_relocations() (handles both abs32
   and abs64 width relocs).

 - Handles libdw's ET_REL path-doubling quirk in make_relative().

 - Declares empty section stanzas in its output assembly so the
   resulting lineinfo.o has LOCAL SECTION symbols rather than GLOBAL
   UND ones; otherwise ld -r would not bind the relocation to the
   .ko's existing section symbol of the same name and depmod would
   warn.
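
The runtime lookup described above amounts to finding the largest table
entry at or below the queried section offset. A minimal userspace
sketch of that search follows. It is an illustration only: the
lookup_index() name and the stdint types are assumptions made for this
example, and it is not the kernel's module_lookup_lineinfo()
implementation (it omits all blob validation).

```c
#include <stdint.h>

/*
 * Illustrative sketch, not kernel code: return the index of the last
 * element of the sorted array addrs[] whose value is <= offset, or -1
 * if offset lies before the first entry (or the table is empty).
 * The name lookup_index() is hypothetical.
 */
static int lookup_index(const uint32_t *addrs, uint32_t num_entries,
			uint32_t offset)
{
	uint32_t low = 0, high = num_entries;

	/* Upper-bound search: low ends at the first entry > offset. */
	while (low < high) {
		uint32_t mid = low + (high - low) / 2;

		if (addrs[mid] <= offset)
			low = mid + 1;
		else
			high = mid;
	}

	/* Step back one slot to the covering entry, if there is one. */
	return low ? (int)(low - 1) : -1;
}
```

With a table of offsets { 0x00, 0x10, 0x30, 0x48 }, a query for 0x34
yields index 2 (the entry starting at 0x30), and a query that precedes
every entry yields -1, which corresponds to "no annotation".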

The build pipeline runs gen-mod-lineinfo.sh after the existing modfinal
step:

    gen_lineinfo --module ${KO} > ${KO}.lineinfo.S
    ${CC} -c -o ${KO}.lineinfo.o ${KO}.lineinfo.S
    ${LD} -r ${KO}.lineinfo.o ${KO} -o ${KO}.tmp && mv ${KO}.tmp ${KO}

Order matters: lineinfo.o must come first so its zero-byte text
contributions stay at offset 0 of the merged sections.

The init blob lives in MOD_INIT_RODATA and is revoked via WRITE_ONCE in
do_init_module() before do_free_init() releases the memory; the
module_init_lineinfo_data() reader uses READ_ONCE so concurrent lookups
either see the old pointer (still valid until do_free_init's
synchronize_rcu) or NULL.

The struct module fields are guarded by
#ifdef CONFIG_KALLSYMS_LINEINFO_MODULES and accessed through inline
reader accessors so callers don't duplicate the guard.

Per-module overhead is approximately 14 bytes per DWARF line entry plus
a small fixed cost per covered section descriptor. The next patch in
this series delta-compresses the per-section streams to ~3-4 bytes per
entry.
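
As a rough worked example of how the parallel arrays in one per-section
table are laid out, the sketch below recomputes the array offsets the
same way the offset helpers in mod_lineinfo.h do. It is a userspace
illustration under stated assumptions: the sketch_* names are
hypothetical, and the header is taken to be the three u32 fields that
struct mod_lineinfo_header declares in this patch.

```c
#include <stdint.h>

/* Assumed to mirror the three u32 fields of the per-section header. */
struct sketch_header {
	uint32_t num_entries;
	uint32_t num_files;
	uint32_t filenames_size;
};

/* Byte offsets of each array, measured from the per-section header. */
struct sketch_layout {
	uint32_t addrs_off;
	uint32_t file_ids_off;
	uint32_t lines_off;
	uint32_t file_offsets_off;
	uint32_t filenames_off;
};

/* Hypothetical helper: compute the layout for the given counts. */
static struct sketch_layout sketch_layout_for(uint32_t num_entries,
					      uint32_t num_files)
{
	struct sketch_layout l;
	uint32_t end_of_file_ids;

	l.addrs_off = (uint32_t)sizeof(struct sketch_header);
	l.file_ids_off = l.addrs_off + num_entries * (uint32_t)sizeof(uint32_t);

	/* u16 file_ids[] may end on a 2-byte boundary; round up to 4. */
	end_of_file_ids = l.file_ids_off +
			  num_entries * (uint32_t)sizeof(uint16_t);
	l.lines_off = (end_of_file_ids + 3) & ~3u;

	l.file_offsets_off = l.lines_off +
			     num_entries * (uint32_t)sizeof(uint32_t);
	l.filenames_off = l.file_offsets_off +
			  num_files * (uint32_t)sizeof(uint32_t);
	return l;
}
```

For 3 entries and 2 files this gives addrs at 12, file_ids at 24, lines
at 32 (after 2 bytes of padding), file_offsets at 44 and filenames at
52. The sketch does no overflow checking; the kernel-side reader in
this patch validates the counts against the blob size before using
such offsets.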
Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Sasha Levin --- .../admin-guide/kallsyms-lineinfo.rst | 41 +- MAINTAINERS | 108 ++- include/linux/mod_lineinfo.h | 104 +++ include/linux/module.h | 39 + init/Kconfig | 19 +- kernel/kallsyms.c | 18 +- kernel/module/kallsyms.c | 181 ++++ kernel/module/main.c | 26 + scripts/Makefile.modfinal | 6 + scripts/gen-mod-lineinfo.sh | 50 + scripts/gen_lineinfo.c | 854 ++++++++++++++++-- 11 files changed, 1320 insertions(+), 126 deletions(-) create mode 100644 include/linux/mod_lineinfo.h create mode 100644 scripts/gen-mod-lineinfo.sh diff --git a/Documentation/admin-guide/kallsyms-lineinfo.rst b/Documentatio= n/admin-guide/kallsyms-lineinfo.rst index c8ec124394354..dd264830c8d5b 100644 --- a/Documentation/admin-guide/kallsyms-lineinfo.rst +++ b/Documentation/admin-guide/kallsyms-lineinfo.rst @@ -51,22 +51,47 @@ With ``CONFIG_KALLSYMS_LINEINFO``:: Note that assembly routines (such as ``entry_SYSCALL_64_after_hwframe``) a= re not annotated because they lack DWARF debug information. =20 +Module Support +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +``CONFIG_KALLSYMS_LINEINFO_MODULES`` extends the feature to loadable kernel +modules. When enabled, each ``.ko`` is post-processed at build time to em= bed +a ``.mod_lineinfo`` section containing the same kind of address-to-source +mapping. + +Enable in addition to the base options:: + + CONFIG_MODULES=3Dy + CONFIG_KALLSYMS_LINEINFO_MODULES=3Dy + +Stack traces from module code will then include annotations:: + + my_driver_func+0x30/0x100 [my_driver] (drivers/foo/bar.c:123) + +The ``.mod_lineinfo`` section is loaded into read-only module memory along= side +the module text. No additional runtime memory allocation is required; the= data +is freed when the module is unloaded. 
+ Memory Overhead =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =20 -The lineinfo tables are stored in ``.rodata`` and typically add approximat= ely -44 MiB to the kernel image for a standard configuration (~4.6 million DWARF -line entries, ~10 bytes per entry after deduplication). +The vmlinux lineinfo tables are stored in ``.rodata`` and typically add +approximately 10-15 MiB to the kernel image for a standard configuration +(~4.6 million DWARF line entries, ~2-3 bytes per entry after delta +compression). + +Per-module lineinfo adds approximately 2-3 bytes per DWARF line entry to e= ach +``.ko`` file. =20 Known Limitations =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =20 -- **vmlinux only**: Only symbols in the core kernel image are annotated. - Module symbols are not covered. -- **4 GiB offset limit**: Address offsets from ``_text`` are stored as 32-= bit - values. Entries beyond 4 GiB from ``_text`` are skipped at build time w= ith - a warning. +- **4 GiB offset limit**: Address offsets from ``_text`` (vmlinux) or + ``.text`` base (modules) are stored as 32-bit values. Entries beyond + 4 GiB are skipped at build time with a warning. - **65535 file limit**: Source file IDs are stored as 16-bit values. Buil= ds with more than 65535 unique source files will fail with an error. - **No assembly annotations**: Functions implemented in assembly that lack DWARF ``.debug_line`` data are not annotated. +- **No init text**: For modules, functions in ``.init.text`` are not annot= ated + because that memory is freed after module initialization. 
diff --git a/MAINTAINERS b/MAINTAINERS index a683a7023f6e9..f264e763d1041 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7873,7 +7873,7 @@ F: drivers/gpu/drm/sun4i/sun8i* =20 DRM DRIVER FOR APPLE TOUCH BARS M: Aun-Ali Zaidi -M: Aditya Garg +M: Aditya Garg L: dri-devel@lists.freedesktop.org S: Maintained T: git https://gitlab.freedesktop.org/drm/misc/kernel.git @@ -13797,6 +13797,9 @@ KALLSYMS LINEINFO M: Sasha Levin S: Maintained F: Documentation/admin-guide/kallsyms-lineinfo.rst +F: include/linux/mod_lineinfo.h +F: lib/tests/lineinfo_kunit.c +F: scripts/gen-mod-lineinfo.sh F: scripts/gen_lineinfo.c =20 KASAN @@ -13866,7 +13869,6 @@ M: Pratyush Yadav R: Dave Young L: kexec@lists.infradead.org S: Maintained -W: http://lse.sourceforge.net/kdump/ F: Documentation/admin-guide/kdump/ F: fs/proc/vmcore.c F: include/linux/crash_core.h @@ -15258,7 +15260,7 @@ M: Andrea Cervesato M: Cyril Hrubis M: Jan Stancek M: Petr Vorel -M: Li Wang +M: Li Wang M: Yang Xu M: Xiao Yang L: ltp@lists.linux.it (subscribers-only) @@ -15405,7 +15407,7 @@ F: include/net/netns/mctp.h F: net/mctp/ =20 MAPLE TREE -M: Liam R. Howlett +M: Liam R. Howlett R: Alice Ryhl R: Andrew Ballance L: maple-tree@lists.infradead.org @@ -16765,7 +16767,7 @@ MEMORY MANAGEMENT - CORE M: Andrew Morton M: David Hildenbrand R: Lorenzo Stoakes -R: Liam R. Howlett +R: Liam R. Howlett R: Vlastimil Babka R: Mike Rapoport R: Suren Baghdasaryan @@ -16811,7 +16813,7 @@ F: mm/sparse.c F: mm/util.c F: mm/vmpressure.c F: mm/vmstat.c -N: include/linux/page[-_]* +N: include\/linux\/page[-_][a-zA-Z]* =20 MEMORY MANAGEMENT - EXECMEM M: Andrew Morton @@ -16901,7 +16903,7 @@ MEMORY MANAGEMENT - MISC M: Andrew Morton M: David Hildenbrand R: Lorenzo Stoakes -R: Liam R. Howlett +R: Liam R. 
Howlett R: Vlastimil Babka R: Mike Rapoport R: Suren Baghdasaryan @@ -16968,6 +16970,7 @@ S: Maintained F: include/linux/compaction.h F: include/linux/gfp.h F: include/linux/page-isolation.h +F: include/linux/pageblock-flags.h F: mm/compaction.c F: mm/debug_page_alloc.c F: mm/debug_page_ref.c @@ -16989,7 +16992,7 @@ M: Andrew Morton M: Johannes Weiner R: David Hildenbrand R: Michal Hocko -R: Qi Zheng +R: Qi Zheng R: Shakeel Butt R: Lorenzo Stoakes L: linux-mm@kvack.org @@ -17002,7 +17005,7 @@ M: Andrew Morton M: David Hildenbrand M: Lorenzo Stoakes R: Rik van Riel -R: Liam R. Howlett +R: Liam R. Howlett R: Vlastimil Babka R: Harry Yoo R: Jann Horn @@ -17049,7 +17052,7 @@ M: David Hildenbrand M: Lorenzo Stoakes R: Zi Yan R: Baolin Wang -R: Liam R. Howlett +R: Liam R. Howlett R: Nico Pache R: Ryan Roberts R: Dev Jain @@ -17087,7 +17090,7 @@ F: tools/testing/selftests/mm/uffd-*.[ch] MEMORY MANAGEMENT - RUST M: Alice Ryhl R: Lorenzo Stoakes -R: Liam R. Howlett +R: Liam R. Howlett L: linux-mm@kvack.org L: rust-for-linux@vger.kernel.org S: Maintained @@ -17101,7 +17104,7 @@ F: rust/kernel/page.rs =20 MEMORY MAPPING M: Andrew Morton -M: Liam R. Howlett +M: Liam R. Howlett M: Lorenzo Stoakes R: Vlastimil Babka R: Jann Horn @@ -17133,7 +17136,7 @@ F: tools/testing/vma/ MEMORY MAPPING - LOCKING M: Andrew Morton M: Suren Baghdasaryan -M: Liam R. Howlett +M: Liam R. Howlett M: Lorenzo Stoakes R: Vlastimil Babka R: Shakeel Butt @@ -17148,7 +17151,7 @@ F: mm/mmap_lock.c =20 MEMORY MAPPING - MADVISE (MEMORY ADVICE) M: Andrew Morton -M: Liam R. Howlett +M: Liam R. Howlett M: Lorenzo Stoakes M: David Hildenbrand R: Vlastimil Babka @@ -18678,19 +18681,59 @@ F: net/xfrm/ F: tools/testing/selftests/net/ipsec.c =20 NETWORKING [IPv4/IPv6] -M: "David S. 
Miller" M: David Ahern +M: Ido Schimmel L: netdev@vger.kernel.org S: Maintained -T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git -F: arch/x86/net/* -F: include/linux/ip.h -F: include/linux/ipv6* +F: Documentation/netlink/specs/rt-addr.yaml +F: Documentation/netlink/specs/rt-neigh.yaml +F: Documentation/netlink/specs/rt-route.yaml +F: Documentation/netlink/specs/rt-rule.yaml +F: include/linux/inetdevice.h +F: include/linux/mroute* +F: include/net/addrconf.h +F: include/net/arp.h F: include/net/fib* +F: include/net/if_inet6.h +F: include/net/inetpeer.h F: include/net/ip* +F: include/net/lwtunnel.h +F: include/net/ndisc.h +F: include/net/netns/nexthop.h +F: include/net/nexthop.h F: include/net/route.h -F: net/ipv4/ -F: net/ipv6/ +F: include/uapi/linux/fib_rules.h +F: include/uapi/linux/in_route.h +F: include/uapi/linux/mroute* +F: include/uapi/linux/nexthop.h +F: net/core/fib* +F: net/core/lwtunnel.c +F: net/ipv4/arp.c +F: net/ipv4/devinet.c +F: net/ipv4/fib* +F: net/ipv4/icmp.c +F: net/ipv4/igmp.c +F: net/ipv4/inet_fragment.c +F: net/ipv4/inetpeer.c +F: net/ipv4/ip* +F: net/ipv4/metrics.c +F: net/ipv4/netlink.c +F: net/ipv4/nexthop.c +F: net/ipv4/route.c +F: net/ipv6/addr* +F: net/ipv6/anycast.c +F: net/ipv6/exthdrs.c +F: net/ipv6/exthdrs_core.c +F: net/ipv6/fib* +F: net/ipv6/icmp.c +F: net/ipv6/ip* +F: net/ipv6/mcast* +F: net/ipv6/ndisc.c +F: net/ipv6/output_core.c +F: net/ipv6/reassembly.c +F: net/ipv6/route.c +F: tools/testing/selftests/net/fib* +F: tools/testing/selftests/net/forwarding/ =20 NETWORKING [LABELED] (NetLabel, Labeled IPsec, SECMARK) M: Paul Moore @@ -18825,18 +18868,11 @@ F: Documentation/networking/net_failover.rst F: drivers/net/net_failover.c F: include/net/net_failover.h =20 -NEXTHOP -M: David Ahern -L: netdev@vger.kernel.org -S: Maintained -F: include/net/netns/nexthop.h -F: include/net/nexthop.h -F: include/uapi/linux/nexthop.h -F: net/ipv4/nexthop.c - NFC SUBSYSTEM -L: netdev@vger.kernel.org -S: Orphan +M: David Heidelberg 
+L: oe-linux-nfc@lists.linux.dev +S: Maintained +T: git https://codeberg.org/linux-nfc/linux.git F: Documentation/devicetree/bindings/net/nfc/ F: drivers/nfc/ F: include/net/nfc/ @@ -20780,6 +20816,7 @@ M: Dominik Brodowski S: Odd Fixes T: git git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux.git F: Documentation/pcmcia/ +F: drivers/net/ethernet/8390/pcnet_cs.c F: drivers/pcmcia/ F: include/pcmcia/ F: tools/pcmcia/ @@ -23375,7 +23412,7 @@ RUST [ALLOC] M: Danilo Krummrich R: Lorenzo Stoakes R: Vlastimil Babka -R: Liam R. Howlett +R: Liam R. Howlett R: Uladzislau Rezki L: rust-for-linux@vger.kernel.org S: Maintained @@ -23527,7 +23564,7 @@ F: drivers/s390/net/ =20 S390 PCI SUBSYSTEM M: Niklas Schnelle -M: Gerald Schaefer +M: Gerd Bayer L: linux-s390@vger.kernel.org S: Supported F: Documentation/arch/s390/pci.rst @@ -24320,7 +24357,7 @@ F: include/media/i2c/rj54n1cb0c.h SHRINKER M: Andrew Morton M: Dave Chinner -R: Qi Zheng +R: Qi Zheng R: Roman Gushchin R: Muchun Song L: linux-mm@kvack.org @@ -24770,6 +24807,7 @@ SOFTWARE RAID (Multiple Disks) SUPPORT M: Song Liu M: Yu Kuai R: Li Nan +R: Xiao Ni L: linux-raid@vger.kernel.org S: Supported Q: https://patchwork.kernel.org/project/linux-raid/list/ diff --git a/include/linux/mod_lineinfo.h b/include/linux/mod_lineinfo.h new file mode 100644 index 0000000000000..9cda3263a0784 --- /dev/null +++ b/include/linux/mod_lineinfo.h @@ -0,0 +1,104 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * mod_lineinfo.h - Binary format for per-module source line information + * + * This header defines the layout of the .mod_lineinfo and + * .init.mod_lineinfo sections embedded in loadable kernel modules. It + * is dual-use: included from both the kernel and the userspace + * gen_lineinfo tool. + * + * Top-level layout (all values in target-native endianness): + * + * struct mod_lineinfo_root + * struct mod_lineinfo_section sections[hdr.num_sections] + * ... 
per-section sub-tables, each pointed at by sections[i].table_offs= et + * + * Each mod_lineinfo_section descriptor identifies one ELF text section + * covered by the lineinfo blob. Its .anchor field is an ELF relocation + * resolved at module-load time to the runtime base of the named section, + * eliminating the need to derive the base from mod->mem[].base segments. + * If the relocation fails to resolve (e.g. unknown reloc type), .anchor + * stays zero and lookups silently degrade to "no annotation". + * + * Each per-section sub-table is laid out as a stand-alone + * mod_lineinfo_header followed by parallel arrays: + * + * struct mod_lineinfo_header (16 bytes) + * u32 addrs[num_entries] -- offsets from this section's base, s= orted + * u16 file_ids[num_entries] -- parallel to addrs + * <2-byte pad if num_entries is odd> + * u32 lines[num_entries] -- parallel to addrs + * u32 file_offsets[num_files] -- byte offset into filenames[] + * char filenames[filenames_size] -- concatenated NUL-terminated strings + */ +#ifndef _LINUX_MOD_LINEINFO_H +#define _LINUX_MOD_LINEINFO_H + +#ifdef __KERNEL__ +#include +#else +#include +typedef uint32_t u32; +typedef uint16_t u16; +typedef uint64_t u64; +#endif + +/* + * Per-section descriptor. One entry per ELF text section covered by the + * blob (.text, .exit.text, .init.text, ...). + */ +struct mod_lineinfo_section { + u64 anchor; /* RELOC: runtime base of covered section, or 0 */ + u32 size; /* covered section size in bytes */ + u32 table_offset; /* byte offset from blob start to this section's + * mod_lineinfo_header */ +}; + +/* + * Top-level header. Sits at offset 0 of every .mod_lineinfo / + * .init.mod_lineinfo section. The compiler inserts 4 bytes of trailing + * padding so the u64 anchor in sections[0] starts 8-byte aligned. 
+ */ +struct mod_lineinfo_root { + u32 num_sections; + struct mod_lineinfo_section sections[]; +}; + +struct mod_lineinfo_header { + u32 num_entries; + u32 num_files; + u32 filenames_size; /* total bytes of concatenated filenames */ +}; + +/* Offset helpers: compute byte offset from the per-section header to each= array. */ + +static inline u32 mod_lineinfo_addrs_off(void) +{ + return sizeof(struct mod_lineinfo_header); +} + +static inline u32 mod_lineinfo_file_ids_off(u32 num_entries) +{ + return mod_lineinfo_addrs_off() + num_entries * sizeof(u32); +} + +static inline u32 mod_lineinfo_lines_off(u32 num_entries) +{ + /* u16 file_ids[] may need 2-byte padding to align lines[] to 4 bytes */ + u32 off =3D mod_lineinfo_file_ids_off(num_entries) + + num_entries * sizeof(u16); + return (off + 3) & ~3u; +} + +static inline u32 mod_lineinfo_file_offsets_off(u32 num_entries) +{ + return mod_lineinfo_lines_off(num_entries) + num_entries * sizeof(u32); +} + +static inline u32 mod_lineinfo_filenames_off(u32 num_entries, u32 num_file= s) +{ + return mod_lineinfo_file_offsets_off(num_entries) + + num_files * sizeof(u32); +} + +#endif /* _LINUX_MOD_LINEINFO_H */ diff --git a/include/linux/module.h b/include/linux/module.h index 7566815fabbe8..2bc0263b086d2 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -507,6 +507,12 @@ struct module { void *btf_data; void *btf_base_data; #endif +#ifdef CONFIG_KALLSYMS_LINEINFO_MODULES + void *lineinfo_data; /* .mod_lineinfo section in MOD_RODATA */ + unsigned int lineinfo_data_size; + void *init_lineinfo_data; /* .init.mod_lineinfo, NULL after init runs */ + unsigned int init_lineinfo_data_size; +#endif #ifdef CONFIG_JUMP_LABEL struct jump_entry *jump_entries; unsigned int num_jump_entries; @@ -1020,6 +1026,39 @@ static inline unsigned long find_kallsyms_symbol_val= ue(struct module *mod, =20 #endif /* CONFIG_MODULES && CONFIG_KALLSYMS */ =20 +bool module_lookup_lineinfo(struct module *mod, unsigned long addr, + const char 
**file, unsigned int *line); + +/* + * Reader accessors so callers don't need to duplicate the + * CONFIG_KALLSYMS_LINEINFO_MODULES guard around mod->lineinfo_data / + * mod->init_lineinfo_data field access. Setters/clearers in the loader + * use the field directly under a matching #ifdef. + */ +static inline void *module_lineinfo_data(const struct module *mod, + unsigned int *size) +{ +#ifdef CONFIG_KALLSYMS_LINEINFO_MODULES + *size =3D mod->lineinfo_data_size; + return mod->lineinfo_data; +#else + *size =3D 0; + return NULL; +#endif +} + +static inline void *module_init_lineinfo_data(const struct module *mod, + unsigned int *size) +{ +#ifdef CONFIG_KALLSYMS_LINEINFO_MODULES + *size =3D READ_ONCE(mod->init_lineinfo_data_size); + return READ_ONCE(mod->init_lineinfo_data); +#else + *size =3D 0; + return NULL; +#endif +} + /* Define __free(module_put) macro for struct module *. */ DEFINE_FREE(module_put, struct module *, if (_T) module_put(_T)) =20 diff --git a/init/Kconfig b/init/Kconfig index 99e78c253056a..3e3acfc37be7e 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2085,8 +2085,23 @@ config KALLSYMS_LINEINFO anon_vma_clone+0x2ed/0xcf0 (mm/rmap.c:412) =20 This requires elfutils (libdw-dev/elfutils-devel) on the build host. - Adds approximately 44MB to a typical kernel image (10 bytes per - DWARF line-table entry, ~4.6M entries for a typical config). + Adds approximately 10-15MB to a typical kernel image (~2-3 bytes + per entry after delta compression, ~4.6M entries for a typical + config). + + If unsure, say N. + +config KALLSYMS_LINEINFO_MODULES + bool "Embed source file:line information in module stack traces" + depends on KALLSYMS_LINEINFO && MODULES + help + Extends KALLSYMS_LINEINFO to loadable kernel modules. Each .ko + gets a lineinfo table generated from its DWARF data at build time, + so stack traces from module code include (file.c:123) annotations. + + Requires elfutils (libdw-dev/elfutils-devel) on the build host. 
+ Increases .ko sizes by approximately 2-3 bytes per DWARF line + entry after delta compression. =20 If unsure, say N. =20 diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c index 1e3f527b13988..d95387f51b4c0 100644 --- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -551,12 +551,24 @@ static int __sprint_symbol(char *buffer, unsigned lon= g address, * replaced with bar()"); appending lineinfo there would produce a * confusing "foo (file:line)()". */ - if (add_lineinfo && IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) && !modname) { + if (add_lineinfo && IS_ENABLED(CONFIG_KALLSYMS_LINEINFO)) { const char *li_file; unsigned int li_line; + bool found =3D false; + + if (!modname) { + found =3D kallsyms_lookup_lineinfo(address, + &li_file, &li_line); + } else if (IS_ENABLED(CONFIG_KALLSYMS_LINEINFO_MODULES)) { + struct module *mod =3D __module_address(address); + + if (mod) + found =3D module_lookup_lineinfo(mod, address, + &li_file, + &li_line); + } =20 - if (kallsyms_lookup_lineinfo(address, - &li_file, &li_line)) + if (found) len +=3D snprintf(buffer + len, KSYM_SYMBOL_LEN - len, " (%s:%u)", li_file, li_line); } diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c index 0fc11e45df9b9..819d6594c2937 100644 --- a/kernel/module/kallsyms.c +++ b/kernel/module/kallsyms.c @@ -494,3 +494,184 @@ int module_kallsyms_on_each_symbol(const char *modnam= e, mutex_unlock(&module_mutex); return ret; } + +#include + +/* + * Search one per-section sub-table for @section_offset using flat parallel + * arrays. @hdr is the per-section header at byte offset @hdr_offset with= in + * @blob. Returns true on hit and populates @file / @line. 
+ */ +static bool module_lookup_lineinfo_section(const void *blob, u32 blob_size, + u32 hdr_offset, + unsigned int section_offset, + const char **file, + unsigned int *line) +{ + const struct mod_lineinfo_header *hdr; + const u8 *base; + const u32 *addrs, *lines, *file_offsets; + const u16 *file_ids; + const char *filenames; + u32 num_entries, num_files, filenames_size; + unsigned int low, high, mid; + u16 file_id; + + if (hdr_offset > blob_size || + blob_size - hdr_offset < sizeof(*hdr)) + return false; + + base =3D (const u8 *)blob + hdr_offset; + hdr =3D (const struct mod_lineinfo_header *)base; + num_entries =3D hdr->num_entries; + num_files =3D hdr->num_files; + filenames_size =3D hdr->filenames_size; + + if (num_entries =3D=3D 0) + return false; + + /* + * Validate counts before multiplying =E2=80=94 sizing arithmetic could + * otherwise overflow on 32-bit with a malformed blob. Each entry + * contributes one u32 (addrs), one u16 (file_ids), and one u32 + * (lines); each file contributes one u32 (file_offsets). + */ + { + u32 avail =3D blob_size - hdr_offset; + u32 needed =3D mod_lineinfo_filenames_off(num_entries, num_files); + + if (num_entries > U32_MAX / sizeof(u32)) + return false; + if (num_files > U32_MAX / sizeof(u32)) + return false; + if (needed > avail || filenames_size > avail - needed) + return false; + } + + /* + * Filenames are read as NUL-terminated C strings. Require the blob + * to end in NUL so a malformed file_offsets entry can never lead the + * later "%s" consumer past the end of the section. 
+ */ + if (filenames_size =3D=3D 0 || + base[mod_lineinfo_filenames_off(num_entries, num_files) + + filenames_size - 1] !=3D 0) + return false; + + addrs =3D (const u32 *)(base + mod_lineinfo_addrs_off()); + file_ids =3D (const u16 *)(base + mod_lineinfo_file_ids_off(num_entries)); + lines =3D (const u32 *)(base + mod_lineinfo_lines_off(num_entries)); + file_offsets =3D (const u32 *)(base + mod_lineinfo_file_offsets_off(num_e= ntries)); + filenames =3D (const char *)(base + mod_lineinfo_filenames_off(num_entrie= s, num_files)); + + /* Binary search for largest entry <=3D section_offset. */ + low =3D 0; + high =3D num_entries; + while (low < high) { + mid =3D low + (high - low) / 2; + if (addrs[mid] <=3D section_offset) + low =3D mid + 1; + else + high =3D mid; + } + + if (low =3D=3D 0) + return false; + low--; + + file_id =3D file_ids[low]; + if (file_id >=3D num_files) + return false; + if (file_offsets[file_id] >=3D filenames_size) + return false; + + *file =3D &filenames[file_offsets[file_id]]; + *line =3D lines[low]; + return true; +} + +/* + * Walk a single .mod_lineinfo / .init.mod_lineinfo blob, find the section + * descriptor whose [anchor, anchor+size) range contains @addr, then search + * that section's sub-table. 
+ */ +static bool module_lookup_lineinfo_blob(const void *blob, u32 blob_size, + unsigned long addr, + const char **file, unsigned int *line) +{ + const struct mod_lineinfo_root *root; + u32 i, sections_end; + + if (!blob || blob_size < sizeof(*root)) + return false; + + root =3D blob; + if (root->num_sections =3D=3D 0) + return false; + + if (root->num_sections > U32_MAX / sizeof(struct mod_lineinfo_section)) + return false; + sections_end =3D sizeof(*root) + + root->num_sections * sizeof(struct mod_lineinfo_section); + if (sections_end > blob_size) + return false; + + for (i =3D 0; i < root->num_sections; i++) { + const struct mod_lineinfo_section *s =3D &root->sections[i]; + unsigned long base =3D (unsigned long)s->anchor; + unsigned long offset; + + if (!base) + continue; /* relocation didn't resolve */ + if (addr < base) + continue; + offset =3D addr - base; + if (offset >=3D s->size) + continue; + if (offset > U32_MAX) + continue; + + return module_lookup_lineinfo_section(blob, blob_size, + s->table_offset, + (unsigned int)offset, + file, line); + } + + return false; +} + +/* + * Look up source file:line for an address within a loaded module. + * + * Safe in NMI/panic context: no locks, no allocations. + * Caller must hold RCU read lock (or be in a context where the module + * cannot be unloaded). + */ +bool module_lookup_lineinfo(struct module *mod, unsigned long addr, + const char **file, unsigned int *line) +{ + const void *blob; + unsigned int size; + + if (!IS_ENABLED(CONFIG_KALLSYMS_LINEINFO_MODULES)) + return false; + + blob =3D module_lineinfo_data(mod, &size); + if (blob && module_lookup_lineinfo_blob(blob, size, addr, file, line)) + return true; + + /* + * The init blob lives in MOD_INIT_RODATA and is revoked by + * do_init_module() before do_free_init() releases the memory. 
The + * READ_ONCE inside module_init_lineinfo_data() pairs with the + * WRITE_ONCE in do_init_module so we never see a partial + * pointer/size pair, and an RCU grace period in do_free_init() + * guarantees the memory still exists for the duration of any lookup + * that captured the pointer before the revocation. + */ + blob =3D module_init_lineinfo_data(mod, &size); + if (blob && module_lookup_lineinfo_blob(blob, size, addr, file, line)) + return true; + + return false; +} diff --git a/kernel/module/main.c b/kernel/module/main.c index 46dd8d25a6058..46bb2bf799d1e 100644 --- a/kernel/module/main.c +++ b/kernel/module/main.c @@ -2712,6 +2712,19 @@ static int find_module_sections(struct module *mod, = struct load_info *info) mod->btf_base_data =3D any_section_objs(info, ".BTF.base", 1, &mod->btf_base_data_size); #endif +#ifdef CONFIG_KALLSYMS_LINEINFO_MODULES + /* + * Use section_objs() (not any_section_objs) =E2=80=94 both blobs carry an + * ELF anchor relocation that the module loader resolves via its + * standard apply_relocations() pass, which only walks SHF_ALLOC + * sections. Picking up a non-ALLOC section here would also leave + * the pointer dangling into the temporary load image once freed. + */ + mod->lineinfo_data =3D section_objs(info, ".mod_lineinfo", 1, + &mod->lineinfo_data_size); + mod->init_lineinfo_data =3D section_objs(info, ".init.mod_lineinfo", 1, + &mod->init_lineinfo_data_size); +#endif #ifdef CONFIG_JUMP_LABEL mod->jump_entries =3D section_objs(info, "__jump_table", sizeof(*mod->jump_entries), @@ -3165,6 +3178,19 @@ static noinline int do_init_module(struct module *mo= d) /* .BTF is not SHF_ALLOC and will get removed, so sanitize pointers */ mod->btf_data =3D NULL; mod->btf_base_data =3D NULL; +#endif +#ifdef CONFIG_KALLSYMS_LINEINFO_MODULES + /* + * .init.mod_lineinfo lives in MOD_INIT_RODATA which do_free_init() is + * about to release. 
Clear the pointer so concurrent stack-trace + * lookups stop dereferencing it; do_free_init()'s synchronize_rcu() + * then waits out any reader that already captured the old pointer. + * WRITE_ONCE pairs with the READ_ONCE inside module_init_lineinfo_data() + * so the compiler can't tear or reorder the revocation across the + * llist_add() that follows. + */ + WRITE_ONCE(mod->init_lineinfo_data, NULL); + WRITE_ONCE(mod->init_lineinfo_data_size, 0); #endif /* * We want to free module_init, but be aware that kallsyms may be diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal index adcbcde16a071..3941cf624526b 100644 --- a/scripts/Makefile.modfinal +++ b/scripts/Makefile.modfinal @@ -46,6 +46,9 @@ quiet_cmd_btf_ko =3D BTF [M] $@ $(CONFIG_SHELL) $(srctree)/scripts/gen-btf.sh --btf_base $(objtree)/vmli= nux $@; \ fi; =20 +quiet_cmd_lineinfo_ko =3D LINEINFO [M] $@ + cmd_lineinfo_ko =3D $(CONFIG_SHELL) $(srctree)/scripts/gen-mod-linei= nfo.sh $@ + # Same as newer-prereqs, but allows to exclude specified extra dependencies newer_prereqs_except =3D $(filter-out $(PHONY) $(1),$?) 
=20 @@ -59,6 +62,9 @@ if_changed_except =3D $(if $(call newer_prereqs_except,$(= 2))$(cmd-check), \ +$(call if_changed_except,ld_ko_o,$(objtree)/vmlinux) ifdef CONFIG_DEBUG_INFO_BTF_MODULES +$(if $(newer-prereqs),$(call cmd,btf_ko)) +endif +ifdef CONFIG_KALLSYMS_LINEINFO_MODULES + +$(if $(newer-prereqs),$(call cmd,lineinfo_ko)) endif +$(call cmd,check_tracepoint) =20 diff --git a/scripts/gen-mod-lineinfo.sh b/scripts/gen-mod-lineinfo.sh new file mode 100644 index 0000000000000..832d290f3bf4c --- /dev/null +++ b/scripts/gen-mod-lineinfo.sh @@ -0,0 +1,50 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +# +# gen-mod-lineinfo.sh - Embed source line info into a kernel module (.ko) +# +# Reads DWARF from the .ko, generates a .mod_lineinfo section that contains +# an ELF relocation against the module's .text section symbol, and partial- +# links the result back into the .ko via "ld -r" so the relocation rides +# along to the module loader. Modeled on scripts/gen-btf.sh. + +set -e + +if [ $# -ne 1 ]; then + echo "Usage: $0 " >&2 + exit 1 +fi + +KO=3D"$1" + +cleanup() { + rm -f "${KO}.lineinfo.S" "${KO}.lineinfo.o" "${KO}.lineinfo.tmp" +} +trap cleanup EXIT + +case "${KBUILD_VERBOSE}" in +*1*) + set -x + ;; +esac + +# Generate assembly from DWARF -- if it fails (no DWARF), silently skip +if ! ${objtree}/scripts/gen_lineinfo --module "${KO}" > "${KO}.lineinfo.S"= ; then + exit 0 +fi + +# Compile assembly to object file +${CC} ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS} \ + ${KBUILD_AFLAGS} ${KBUILD_AFLAGS_MODULE} \ + -c -o "${KO}.lineinfo.o" "${KO}.lineinfo.S" + +# Partial-link lineinfo.o INTO the .ko. Order matters: lineinfo.o must co= me +# FIRST so its empty .text contributes 0 bytes at offset 0 of the merged +# .text, which keeps the .quad .text relocation (against lineinfo.o's local +# .text symbol, which after merge points at offset 0 of merged .text) +# resolving to the start of the module's .text. 
Reversing inputs here +# silently breaks lookup correctness. +${LD} -r "${KO}.lineinfo.o" "${KO}" -o "${KO}.lineinfo.tmp" +mv "${KO}.lineinfo.tmp" "${KO}" + +exit 0 diff --git a/scripts/gen_lineinfo.c b/scripts/gen_lineinfo.c index 699e760178f09..e1e08469b4f2f 100644 --- a/scripts/gen_lineinfo.c +++ b/scripts/gen_lineinfo.c @@ -24,16 +24,66 @@ #include #include =20 +#include "../include/linux/mod_lineinfo.h" + +static int module_mode; + static unsigned int skipped_overflow; =20 +/* Target ELF traits, captured once in main() and reused at emit time. */ +static bool target_64bit; +static bool target_le; + /* - * vmlinux mode: end of the invariant .text region. Zero means "no cap" - * (graceful fallback when _etext is absent on some build). + * Vmlinux mode only: address range of the *invariant* .text region. + * See find_text_end_addr() for why we cap on _etext. text_end_addr =3D= =3D 0 + * means "no cap available; capture everything above text_addr" (v3 + * behavior, used as graceful fallback if _etext is absent). */ static unsigned long long text_end_addr; =20 +/* + * In module mode we cover several text-like sections, split across two + * output blobs by lifecycle: + * + * .mod_lineinfo -- persistent code (.text, .exit.text); MOD_RODATA + * .init.mod_lineinfo -- init code (.init.text); freed with init memory + * + * In ET_REL .ko files .text/.init.text/.exit.text all have sh_addr =3D=3D= 0, + * so DWARF line addresses (which become sh_addr + addend after relocation) + * collide across sections. We disambiguate by giving each *present* + * covered section a unique synthetic "bias" =E2=80=94 a u32 base address = =E2=80=94 and + * adding that bias to relocated values inside apply_debug_line_relocation= s. + * libdw then yields biased addresses that classify_address() can map back + * to a single section unambiguously. The bias is internal to gen_lineinfo + * and never leaks into the emitted blob. 
+ */ +enum mod_lineinfo_blob { + BLOB_PERSISTENT, + BLOB_INIT, + NUM_BLOBS, +}; + +struct covered_section { + const char *name; /* ELF section name (e.g. ".text") */ + enum mod_lineinfo_blob blob; + unsigned long long bias;/* synthetic base address (set in resolve_*) */ + unsigned long long size; + bool present; /* found in this .ko */ + unsigned int sec_index; /* ELF section header index, for reloc matching */ + unsigned int n_entries; /* DWARF line entries collected for this section = */ +}; + +static struct covered_section all_sections[] =3D { + { .name =3D ".text", .blob =3D BLOB_PERSISTENT }, + { .name =3D ".exit.text", .blob =3D BLOB_PERSISTENT }, + { .name =3D ".init.text", .blob =3D BLOB_INIT }, +}; +#define ALL_SECTIONS (sizeof(all_sections) / sizeof(all_sections[0])) + struct line_entry { - unsigned int offset; /* offset from _text */ + unsigned int offset; /* offset from covered section's start */ + unsigned int section_id;/* index into covered_sections[] (module mode onl= y) */ unsigned int file_id; unsigned int line; }; @@ -52,7 +102,12 @@ static struct file_entry *files; static unsigned int num_files; static unsigned int files_capacity; =20 -#define FILE_HASH_BITS 13 +/* + * Hash size must comfortably exceed the 65535-file cap below so the open + * addressing in find_or_add_file() always has a free slot to land on. + * 17 bits =3D 131072 entries gives ~50% max load factor. + */ +#define FILE_HASH_BITS 17 #define FILE_HASH_SIZE (1 << FILE_HASH_BITS) =20 struct file_hash_entry { @@ -71,8 +126,8 @@ static unsigned int hash_str(const char *s) return h & (FILE_HASH_SIZE - 1); } =20 -static void add_entry(unsigned int offset, unsigned int file_id, - unsigned int line) +static void add_entry(unsigned int offset, unsigned int section_id, + unsigned int file_id, unsigned int line) { if (num_entries >=3D entries_capacity) { entries_capacity =3D entries_capacity ? 
entries_capacity * 2 : 65536; @@ -83,6 +138,7 @@ static void add_entry(unsigned int offset, unsigned int = file_id, } } entries[num_entries].offset =3D offset; + entries[num_entries].section_id =3D section_id; entries[num_entries].file_id =3D file_id; entries[num_entries].line =3D line; num_entries++; @@ -155,27 +211,25 @@ static const char *make_relative(const char *path, co= nst char *comp_dir) { const char *p; =20 - /* If already relative, use as-is */ - if (path[0] !=3D '/') - return path; - - /* comp_dir from DWARF is the most reliable method */ - if (comp_dir) { - size_t len =3D strlen(comp_dir); - - if (!strncmp(path, comp_dir, len) && path[len] =3D=3D '/') { - const char *rel =3D path + len + 1; - - /* - * If comp_dir pointed to a subdirectory - * (e.g. arch/parisc/kernel) rather than - * the tree root, stripping it leaves a - * bare filename. Fall through to the - * kernel_dirs scan so we recover the full - * relative path instead. - */ - if (strchr(rel, '/')) - return rel; + if (path[0] =3D=3D '/') { + /* Try comp_dir prefix from DWARF */ + if (comp_dir) { + size_t len =3D strlen(comp_dir); + + if (!strncmp(path, comp_dir, len) && path[len] =3D=3D '/') { + const char *rel =3D path + len + 1; + + /* + * If comp_dir pointed to a subdirectory + * (e.g. arch/parisc/kernel) rather than + * the tree root, stripping it leaves a + * bare filename. Fall through to the + * kernel_dirs scan so we recover the full + * relative path instead. + */ + if (strchr(rel, '/')) + return rel; + } } =20 /* @@ -201,9 +255,42 @@ static const char *make_relative(const char *path, con= st char *comp_dir) return p ? p + 1 : path; } =20 - /* Fall back to basename */ - p =3D strrchr(path, '/'); - return p ? p + 1 : path; + /* + * Relative path =E2=80=94 check for duplicated-path quirk from libdw + * on ET_REL files (e.g., "a/b.c/a/b.c" =E2=86=92 "a/b.c"). 
+ */ + { + size_t len =3D strlen(path); + size_t mid =3D len / 2; + + if (len > 1 && path[mid] =3D=3D '/' && + !memcmp(path, path + mid + 1, mid)) + return path + mid + 1; + } + + /* + * Bare filename with no directory component =E2=80=94 try to recover the + * relative path using comp_dir. Some toolchains/elfutils combos + * produce bare filenames where comp_dir holds the source directory. + * Construct the absolute path and run the kernel_dirs scan. + */ + if (!strchr(path, '/') && comp_dir && comp_dir[0] =3D=3D '/') { + static char buf[PATH_MAX]; + + snprintf(buf, sizeof(buf), "%s/%s", comp_dir, path); + for (p =3D buf + 1; *p; p++) { + if (*(p - 1) =3D=3D '/') { + for (unsigned int i =3D 0; i < sizeof(kernel_dirs) / + sizeof(kernel_dirs[0]); i++) { + if (!strncmp(p, kernel_dirs[i], + strlen(kernel_dirs[i]))) + return p; + } + } + } + } + + return path; } =20 static int compare_entries(const void *a, const void *b) @@ -211,6 +298,9 @@ static int compare_entries(const void *a, const void *b) const struct line_entry *ea =3D a; const struct line_entry *eb =3D b; =20 + /* Group by section first so each per-section table is contiguous. */ + if (ea->section_id !=3D eb->section_id) + return ea->section_id < eb->section_id ? -1 : 1; if (ea->offset !=3D eb->offset) return ea->offset < eb->offset ? -1 : 1; if (ea->file_id !=3D eb->file_id) @@ -222,7 +312,8 @@ static int compare_entries(const void *a, const void *b) =20 /* * Look up a vmlinux symbol by exact name and return its st_value, or - * @fallback if absent. Aborts when @required and the symbol is missing. + * @fallback if the symbol is absent (lets callers gracefully skip + * optional bounds like _etext). 
*/ static unsigned long long find_vmlinux_sym(Elf *elf, const char *name, unsigned long long fallback, @@ -270,22 +361,325 @@ static unsigned long long find_text_addr(Elf *elf) } =20 /* - * vmlinux is linked in multiple passes: gen_lineinfo runs against - * .tmp_vmlinux1 (which carries an empty lineinfo stub), then real tables - * are linked in for the final image. Sections placed AFTER .rodata - * (.init.text, .exit.text, ...) shift forward as .rodata grows to hold - * the real lineinfo blob, so DWARF addresses we'd capture for them in - * pass 1 would be stale in the final kernel. Cap captured addresses at - * _etext, the symbol that marks the end of .text =E2=80=94 placed before = .rodata - * in every architecture's vmlinux.lds.S, so its addresses are invariant - * across the relink. Returns 0 if _etext is absent (no cap; v3 behavior). + * Vmlinux is linked in multiple passes: gen_lineinfo runs against + * .tmp_vmlinux1 (which carries the empty lineinfo stub), and the resulting + * tables are then linked into the final vmlinux. Sections placed AFTER + * .rodata (.init.text, .exit.text, ...) shift forward as the real lineinfo + * tables replace the empty stub, so DWARF addresses we'd capture for them + * here are stale by the time the kernel runs. + * + * Cap the captured range at _etext, the symbol that marks the end of the + * .text section. .text is placed BEFORE .rodata in every architecture's + * vmlinux.lds.S, so its addresses are invariant across the relink. + * Returns 0 on architectures or builds that don't expose _etext, in which + * case the cap is disabled (preserving the v3 behavior =E2=80=94 addresse= s past + * .text remain captured but may be off in stack traces). */ static unsigned long long find_text_end_addr(Elf *elf) { return find_vmlinux_sym(elf, "_etext", 0, false); } =20 -static void process_dwarf(Dwarf *dwarf, unsigned long long text_addr) +/* + * Populate @sections[].present/sec_index/size/bias. 
Sections that don't + * exist stay marked absent. Biases are assigned in array order: each + * present section gets a base equal to the running total of preceding + * present sections' sizes, rounded up to 16 to keep ranges sparse. This + * guarantees [bias, bias+size) ranges are pairwise disjoint and fit in + * u32 as long as the sum of all covered text sizes is below 4 GiB. + */ +static void resolve_covered_sections(Elf *elf, + struct covered_section *sections, + unsigned int num_sections) +{ + Elf_Scn *scn =3D NULL; + GElf_Shdr shdr; + size_t shstrndx; + unsigned long long cursor =3D 0; + + if (elf_getshdrstrndx(elf, &shstrndx) !=3D 0) + return; + + while ((scn =3D elf_nextscn(elf, scn)) !=3D NULL) { + const char *name; + + if (!gelf_getshdr(scn, &shdr)) + continue; + name =3D elf_strptr(elf, shstrndx, shdr.sh_name); + if (!name) + continue; + for (unsigned int i =3D 0; i < num_sections; i++) { + if (sections[i].present) + continue; + if (strcmp(name, sections[i].name)) + continue; + if (shdr.sh_size > UINT_MAX) { + fprintf(stderr, + "lineinfo: section %s exceeds 4 GiB (size=3D%llu); skipping\n", + name, + (unsigned long long)shdr.sh_size); + break; + } + sections[i].sec_index =3D elf_ndxscn(scn); + sections[i].size =3D shdr.sh_size; + sections[i].present =3D true; + break; + } + } + + /* Pack present sections into non-overlapping bias ranges. */ + for (unsigned int i =3D 0; i < num_sections; i++) { + if (!sections[i].present) + continue; + sections[i].bias =3D cursor; + cursor +=3D sections[i].size; + cursor =3D (cursor + 15) & ~15ULL; /* pad for separation */ + } +} + +/* Look up a covered_section by ELF section header index. 
*/ +static struct covered_section *section_by_index(struct covered_section *se= ctions, + unsigned int num_sections, + unsigned int sec_index) +{ + for (unsigned int i =3D 0; i < num_sections; i++) { + if (sections[i].present && sections[i].sec_index =3D=3D sec_index) + return &sections[i]; + } + return NULL; +} + +/* + * Apply .rela.debug_line relocations to a mutable copy of .debug_line dat= a. + * + * elfutils libdw (through at least 0.194) does NOT apply relocations for + * ET_REL files when using dwarf_begin_elf(). The internal libdwfl layer + * does this via __libdwfl_relocate(), but that API is not public. + * + * For DWARF5, the .debug_line file name table uses DW_FORM_line_strp + * references into .debug_line_str. Without relocation, all these offsets + * resolve to 0 (or garbage), causing dwarf_linesrc()/dwarf_filesrc() to + * return wrong filenames (typically the comp_dir for every file). + * + * This function applies the relocations manually so that the patched + * .debug_line data can be fed to dwarf_begin_elf() and produce correct + * results. + * + * See elfutils bug https://sourceware.org/bugzilla/show_bug.cgi?id=3D31447 + * A fix (dwelf_elf_apply_relocs) was proposed but not yet merged as of + * elfutils 0.194: https://sourceware.org/pipermail/elfutils-devel/2024q3/= 007388.html + */ +/* + * Determine the relocation type for a 32-bit absolute reference + * on the given architecture. Returns 0 if unknown.
+ */ +static unsigned int r_type_abs32(unsigned int e_machine) +{ + switch (e_machine) { + case EM_X86_64: return R_X86_64_32; + case EM_386: return R_386_32; + case EM_AARCH64: return R_AARCH64_ABS32; + case EM_ARM: return R_ARM_ABS32; + case EM_RISCV: return R_RISCV_32; + case EM_S390: return R_390_32; + case EM_MIPS: return R_MIPS_32; + case EM_PPC64: return R_PPC64_ADDR32; + case EM_PPC: return R_PPC_ADDR32; + case EM_LOONGARCH: return R_LARCH_32; + case EM_PARISC: return R_PARISC_DIR32; + default: return 0; + } +} + +/* + * Determine the relocation type for a 64-bit absolute reference + * on the given architecture. Returns 0 on 32-bit-only architectures + * (where DW_LNE_set_address fits in 32 bits and r_type_abs32 covers it). + */ +static unsigned int r_type_abs64(unsigned int e_machine) +{ + switch (e_machine) { + case EM_X86_64: return R_X86_64_64; + case EM_AARCH64: return R_AARCH64_ABS64; + case EM_RISCV: return R_RISCV_64; + case EM_S390: return R_390_64; + case EM_MIPS: return R_MIPS_64; + case EM_PPC64: return R_PPC64_ADDR64; + case EM_LOONGARCH: return R_LARCH_64; + case EM_PARISC: return R_PARISC_DIR64; + default: return 0; + } +} + +/* + * Write a 4- or 8-byte unsigned integer in target byte order. + * Cross-builds (e.g. x86_64 host -> s390 module) need the patched + * .debug_line bytes laid out per the .ko's e_ident[EI_DATA], not the host= 's. 
+ */ +static void elf_write_uint(unsigned char *dst, uint64_t value, size_t size, + bool little_endian) +{ + if (little_endian) { + for (size_t i =3D 0; i < size; i++) + dst[i] =3D (value >> (i * 8)) & 0xff; + } else { + for (size_t i =3D 0; i < size; i++) + dst[i] =3D (value >> ((size - 1 - i) * 8)) & 0xff; + } +} + +static void apply_debug_line_relocations(Elf *elf) +{ + Elf_Scn *scn =3D NULL; + Elf_Scn *debug_line_scn =3D NULL; + Elf_Scn *rela_debug_line_scn =3D NULL; + Elf_Scn *symtab_scn =3D NULL; + GElf_Shdr shdr; + GElf_Ehdr ehdr; + unsigned int abs32_type, abs64_type; + bool target_le; + size_t shstrndx; + Elf_Data *dl_data, *rela_data, *sym_data; + GElf_Shdr rela_shdr, sym_shdr; + size_t nrels, i; + + if (gelf_getehdr(elf, &ehdr) =3D=3D NULL) + return; + + abs32_type =3D r_type_abs32(ehdr.e_machine); + abs64_type =3D r_type_abs64(ehdr.e_machine); + if (!abs32_type && !abs64_type) + return; + target_le =3D (ehdr.e_ident[EI_DATA] =3D=3D ELFDATA2LSB); + + if (elf_getshdrstrndx(elf, &shstrndx) !=3D 0) + return; + + /* Find the relevant sections */ + while ((scn =3D elf_nextscn(elf, scn)) !=3D NULL) { + const char *name; + + if (!gelf_getshdr(scn, &shdr)) + continue; + name =3D elf_strptr(elf, shstrndx, shdr.sh_name); + if (!name) + continue; + + if (!strcmp(name, ".debug_line")) + debug_line_scn =3D scn; + else if (!strcmp(name, ".rela.debug_line")) + rela_debug_line_scn =3D scn; + else if (shdr.sh_type =3D=3D SHT_SYMTAB) + symtab_scn =3D scn; + } + + if (!debug_line_scn || !rela_debug_line_scn || !symtab_scn) + return; + + dl_data =3D elf_getdata(debug_line_scn, NULL); + rela_data =3D elf_getdata(rela_debug_line_scn, NULL); + sym_data =3D elf_getdata(symtab_scn, NULL); + if (!dl_data || !rela_data || !sym_data) + return; + + if (!gelf_getshdr(rela_debug_line_scn, &rela_shdr)) + return; + if (!gelf_getshdr(symtab_scn, &sym_shdr)) + return; + + nrels =3D rela_shdr.sh_size / rela_shdr.sh_entsize; + + for (i =3D 0; i < nrels; i++) { + GElf_Rela rela; + GElf_Sym 
sym; + unsigned int r_type; + size_t r_sym; + bool is_abs64; + + if (!gelf_getrela(rela_data, i, &rela)) + continue; + + r_type =3D GELF_R_TYPE(rela.r_info); + r_sym =3D GELF_R_SYM(rela.r_info); + + /* + * Two reloc widths matter for .debug_line: + * abs32 - DW_FORM_line_strp file-table refs into .debug_line_str + * abs64 - DW_LNE_set_address arguments (sequence start PCs) + * Without both, libdw sees zeros and reports wrong filenames or + * collapses every sequence to address 0 (collision after dedup). + */ + if (abs32_type && r_type =3D=3D abs32_type) { + is_abs64 =3D false; + } else if (abs64_type && r_type =3D=3D abs64_type) { + is_abs64 =3D true; + } else { + continue; + } + + if (!gelf_getsym(sym_data, r_sym, &sym)) + continue; + + size_t width =3D is_abs64 ? 8 : 4; + uint64_t value =3D (uint64_t)(sym.st_value + rela.r_addend); + + /* + * If the relocation targets one of our covered text sections, + * fold in that section's synthetic bias so the patched DWARF + * address lands in a unique numeric range. String-ref relocs + * (DW_FORM_line_strp into .debug_line_str) target a different + * section, so the symbol-based check correctly excludes them + * from biasing =E2=80=94 for both abs64 (64-bit ELF) and abs32 (32-bit + * ELF, where DW_LNE_set_address is also 4 bytes wide). + */ + if (module_mode) { + struct covered_section *cs; + + cs =3D section_by_index(all_sections, ALL_SECTIONS, + sym.st_shndx); + if (cs) + value +=3D cs->bias; + } + + if (!is_abs64) + value &=3D 0xffffffffULL; + + if (rela.r_offset + width <=3D dl_data->d_size) + elf_write_uint((unsigned char *)dl_data->d_buf + + rela.r_offset, + value, width, target_le); + } +} + +/* + * Decide which covered_section a (biased) DWARF address belongs to. + * apply_debug_line_relocations() has already added the section's bias to + * each line-program PC, so [bias, bias+size) ranges are pairwise disjoint + * and a simple linear scan picks the right bucket. 
Returns the index + * within @sections, or @num_sections if @addr falls outside every + * present range (caller skips the entry). + */ +static unsigned int classify_address(struct covered_section *sections, + unsigned int num_sections, + unsigned long long addr, + unsigned long long *out_offset) +{ + for (unsigned int i =3D 0; i < num_sections; i++) { + if (!sections[i].present) + continue; + if (addr < sections[i].bias) + continue; + if (addr >=3D sections[i].bias + sections[i].size) + continue; + *out_offset =3D addr - sections[i].bias; + return i; + } + return num_sections; +} + +static void process_dwarf(Dwarf *dwarf, unsigned long long text_addr, + struct covered_section *sections, + unsigned int num_sections) { Dwarf_Off off =3D 0, next_off; size_t hdr_size; @@ -312,7 +706,8 @@ static void process_dwarf(Dwarf *dwarf, unsigned long l= ong text_addr) Dwarf_Addr addr; const char *src; const char *rel; - unsigned int file_id, loffset; + unsigned int file_id, loffset, sec_id; + unsigned long long sec_off; int lineno; =20 if (!line) @@ -329,56 +724,87 @@ static void process_dwarf(Dwarf *dwarf, unsigned long= long text_addr) if (!src) continue; =20 - if (addr < text_addr) - continue; - /* - * Skip addresses past _etext. Sections after .rodata - * shift when the real lineinfo replaces the empty stub - * during the multi-pass vmlinux link, so any address - * we'd capture there would be stale by the time the - * final kernel runs. - */ - if (text_end_addr && addr >=3D text_end_addr) - continue; - - { - unsigned long long raw_offset =3D addr - text_addr; + if (module_mode) { + /* + * In ET_REL .ko files .text/.init.text/.exit.text + * all share sh_addr =3D=3D 0; classify_address picks + * the right bucket from the explicit ranges we + * captured. 
+ */ + sec_id =3D classify_address(sections, num_sections, + addr, &sec_off); + if (sec_id =3D=3D num_sections) + continue; + if (sec_off > UINT_MAX) { + skipped_overflow++; + continue; + } + loffset =3D (unsigned int)sec_off; + sections[sec_id].n_entries++; + } else { + unsigned long long raw_offset; =20 + if (addr < text_addr) + continue; + /* + * Skip addresses past _etext. Sections after + * .rodata shift when the real lineinfo replaces + * the empty stub during the multi-pass vmlinux + * link, so any address we'd capture there would + * be stale by the time the final kernel runs. + */ + if (text_end_addr && addr >=3D text_end_addr) + continue; + raw_offset =3D addr - text_addr; if (raw_offset > UINT_MAX) { skipped_overflow++; continue; } loffset =3D (unsigned int)raw_offset; + sec_id =3D 0; } =20 rel =3D make_relative(src, comp_dir); file_id =3D find_or_add_file(rel); =20 - add_entry(loffset, file_id, (unsigned int)lineno); + add_entry(loffset, sec_id, file_id, (unsigned int)lineno); } next: off =3D next_off; } } =20 -static void deduplicate(void) +static void deduplicate(struct covered_section *sections, + unsigned int num_sections) { unsigned int i, j; =20 if (num_entries < 2) return; =20 - /* Sort by offset, then file_id, then line for stability */ + /* + * Sort by section_id, then offset, then file_id, line. This groups + * each section's entries contiguously so the per-section emit can + * iterate a simple range, and ensures the binary search invariant + * (offsets ascending) holds within each section. + */ qsort(entries, num_entries, sizeof(*entries), compare_entries); =20 /* - * Remove duplicate entries: - * - Same offset: keep first (deterministic from stable sort keys) - * - Same file:line as previous kept entry: redundant for binary - * search -- any address between them resolves to the earlier one + * Remove duplicates. 
Reset on a section_id boundary: the same offset + * can legitimately appear in two different sections (they all start + * at sh_addr 0 in ET_REL), and the "same as previous kept entry" + * collapse is only meaningful inside one section's binary-search + * domain. */ j =3D 0; for (i =3D 1; i < num_entries; i++) { + if (entries[i].section_id !=3D entries[j].section_id) { + j++; + if (j !=3D i) + entries[j] =3D entries[i]; + continue; + } if (entries[i].offset =3D=3D entries[j].offset) continue; if (entries[i].file_id =3D=3D entries[j].file_id && @@ -389,6 +815,14 @@ static void deduplicate(void) entries[j] =3D entries[i]; } num_entries =3D j + 1; + + /* Recompute per-section n_entries from the deduped array. */ + if (sections) { + for (unsigned int k =3D 0; k < num_sections; k++) + sections[k].n_entries =3D 0; + for (i =3D 0; i < num_entries; i++) + sections[entries[i].section_id].n_entries++; + } } =20 static void compute_file_offsets(void) @@ -486,6 +920,199 @@ static void output_assembly(void) printf("\n"); } =20 +/* + * Emit one per-section table in the simple flat-array layout: + * + * mod_lineinfo_header + * addrs[count] (u32, sorted) + * file_ids[count] (u16) + 2-byte pad if count is odd + * lines[count] (u32) + * file_offsets[] (u32) + * filenames[] + * + * @suffix uniquifies labels so multiple tables can coexist in one blob. + * Caller has sorted entries[] so this section's entries occupy [first, + * first + count). 
+ */ +static void emit_section_table(unsigned int first, unsigned int count, + const char *suffix) +{ + printf(".Lhdr%s:\n", suffix); + printf("\t.balign 4\n"); + printf("\t.long %u\t\t/* num_entries */\n", count); + printf("\t.long %u\t\t/* num_files */\n", num_files); + printf("\t.long .Lfilenames_end%s - .Lfilenames%s\n\n", suffix, suffix); + + /* addrs[] */ + for (unsigned int i =3D 0; i < count; i++) + printf("\t.long 0x%x\n", entries[first + i].offset); + + /* file_ids[] */ + for (unsigned int i =3D 0; i < count; i++) + printf("\t.short %u\n", entries[first + i].file_id); + if (count & 1) + printf("\t.short 0\t\t/* pad to align lines[] */\n"); + + /* lines[] */ + for (unsigned int i =3D 0; i < count; i++) + printf("\t.long %u\n", entries[first + i].line); + + /* file_offsets[] */ + printf("\t.balign 4\n"); + for (unsigned int i =3D 0; i < num_files; i++) + printf("\t.long %u\n", files[i].str_offset); + + /* filenames[] */ + printf(".Lfilenames%s:\n", suffix); + for (unsigned int i =3D 0; i < num_files; i++) + print_escaped_asciz(files[i].name); + printf(".Lfilenames_end%s:\n", suffix); +} + +/* + * Emit one mod_lineinfo_section descriptor. The "anchor" field is a + * relocation against the named ELF section symbol; the module loader + * resolves it on load to the runtime base of that section. + * + * On 64-bit ELF: 8-byte slot via .quad (R_*_64 reloc). + * On 32-bit ELF: 4-byte reloc via .long , plus 4 bytes of zero + * padding. The two halves are ordered to match target endianness so a + * naive u64 read on the kernel side recovers the relocated value. 
+ */ +static void emit_section_descriptor(const char *section_name, + unsigned long long size, + const char *table_label, + const char *root_label) +{ + if (target_64bit) { + printf("\t.quad %s\t/* sections[].anchor (RELOC) */\n", + section_name); + } else if (target_le) { + printf("\t.long %s\t/* sections[].anchor low (RELOC) */\n", + section_name); + printf("\t.long 0\t\t/* sections[].anchor high pad */\n"); + } else { + printf("\t.long 0\t\t/* sections[].anchor high pad */\n"); + printf("\t.long %s\t/* sections[].anchor low (RELOC) */\n", + section_name); + } + printf("\t.long %llu\t/* sections[].size */\n", size); + printf("\t.long %s - %s\t/* sections[].table_offset */\n", + table_label, root_label); +} + +/* + * Emit one .mod_lineinfo / .init.mod_lineinfo blob. Walks all_sections[] + * picking only entries that (a) belong to the requested blob and (b) + * actually produced at least one DWARF line entry =E2=80=94 sections pres= ent in + * the .ko but without DWARF (e.g. compiler-generated stub thunks) are + * silently skipped. The caller-supplied entries[] is already sorted by + * section_id, so each section's entries are contiguous; we walk the + * master array in order to compute per-section starting indices. + */ +static void emit_blob(const char *output_section, + const char *blob_tag, + enum mod_lineinfo_blob blob) +{ + unsigned int active =3D 0; + unsigned int section_starts[ALL_SECTIONS]; + unsigned int cursor =3D 0; + + for (unsigned int i =3D 0; i < ALL_SECTIONS; i++) { + section_starts[i] =3D cursor; + cursor +=3D all_sections[i].n_entries; + if (all_sections[i].blob =3D=3D blob && all_sections[i].n_entries) + active++; + } + + if (!active) + return; + + printf("\t.section %s, \"a\"\n\n", output_section); + + printf("\t.balign 8\n"); + printf(".Lroot_%s:\n", blob_tag); + printf("\t.long %u\t\t/* num_sections */\n", active); + /* Pad to align the u64 anchor in sections[0] to 8 bytes. 
*/ + printf("\t.balign 8\n"); + + { + unsigned int slot =3D 0; + for (unsigned int i =3D 0; i < ALL_SECTIONS; i++) { + char table_label[64]; + char root_label[64]; + + if (all_sections[i].blob !=3D blob) + continue; + if (!all_sections[i].n_entries) + continue; + snprintf(table_label, sizeof(table_label), + ".Lhdr_%s_%u", blob_tag, slot); + snprintf(root_label, sizeof(root_label), + ".Lroot_%s", blob_tag); + emit_section_descriptor(all_sections[i].name, + all_sections[i].size, + table_label, root_label); + slot++; + } + } + printf("\n"); + + { + unsigned int slot =3D 0; + + for (unsigned int i =3D 0; i < ALL_SECTIONS; i++) { + char suffix[64]; + + if (all_sections[i].blob !=3D blob) + continue; + if (!all_sections[i].n_entries) + continue; + snprintf(suffix, sizeof(suffix), "_%s_%u", + blob_tag, slot); + emit_section_table(section_starts[i], + all_sections[i].n_entries, + suffix); + slot++; + } + } + printf("\n"); +} + +/* + * Declare each text-like section we plan to reference as an empty + * SHF_EXECINSTR section in this object. Without these stanzas the + * assembler treats `.quad .exit.text` as an undefined external symbol; + * after ld -r the resulting GLOBAL UND `.exit.text` doesn't bind to the + * .ko's LOCAL SECTION symbol of the same name, leaving depmod with an + * unresolved-symbol warning and the loader unable to relocate the anchor. + * + * Declaring the section here gives lineinfo.o its own local SECTION + * symbol; ld -r merges sections by name so the local symbol simply + * relocates to offset 0 of the merged section (lineinfo.o is linked + * FIRST so its zero-byte contribution stays at the start). 
+ */ +static void declare_empty_text_sections(void) +{ + for (unsigned int i =3D 0; i < ALL_SECTIONS; i++) { + if (!all_sections[i].present) + continue; + printf("\t.section %s, \"ax\"\n", all_sections[i].name); + } + printf("\n"); +} + +static void output_module_assembly(void) +{ + printf("/* SPDX-License-Identifier: GPL-2.0 */\n"); + printf("/*\n"); + printf(" * Automatically generated by scripts/gen_lineinfo --module\n"); + printf(" * Do not edit.\n"); + printf(" */\n\n"); + + declare_empty_text_sections(); +} + int main(int argc, char *argv[]) { int fd; @@ -493,12 +1120,23 @@ int main(int argc, char *argv[]) Dwarf *dwarf; unsigned long long text_addr; =20 + if (argc >=3D 2 && !strcmp(argv[1], "--module")) { + module_mode =3D 1; + argv++; + argc--; + } + if (argc !=3D 2) { - fprintf(stderr, "Usage: %s \n", argv[0]); + fprintf(stderr, "Usage: %s [--module] \n", argv[0]); return 1; } =20 - fd =3D open(argv[1], O_RDONLY); + /* + * For module mode, open O_RDWR so we can apply debug section + * relocations to the in-memory ELF data. The modifications + * are NOT written back to disk (no elf_update() call). + */ + fd =3D open(argv[1], module_mode ? O_RDWR : O_RDONLY); if (fd < 0) { fprintf(stderr, "Cannot open %s: %s\n", argv[1], strerror(errno)); @@ -506,7 +1144,7 @@ int main(int argc, char *argv[]) } =20 elf_version(EV_CURRENT); - elf =3D elf_begin(fd, ELF_C_READ, NULL); + elf =3D elf_begin(fd, module_mode ? 
ELF_C_RDWR : ELF_C_READ, NULL); if (!elf) { fprintf(stderr, "elf_begin failed: %s\n", elf_errmsg(elf_errno())); @@ -514,8 +1152,34 @@ int main(int argc, char *argv[]) return 1; } =20 - text_addr =3D find_text_addr(elf); - text_end_addr =3D find_text_end_addr(elf); + { + GElf_Ehdr ehdr; + + if (gelf_getehdr(elf, &ehdr) =3D=3D NULL) { + fprintf(stderr, "gelf_getehdr failed\n"); + elf_end(elf); + close(fd); + return 1; + } + target_64bit =3D (ehdr.e_ident[EI_CLASS] =3D=3D ELFCLASS64); + target_le =3D (ehdr.e_ident[EI_DATA] =3D=3D ELFDATA2LSB); + } + + if (module_mode) { + /* + * .ko files are ET_REL after ld -r. Resolve covered text + * sections FIRST so apply_debug_line_relocations() can use + * the assigned biases when patching line-program addresses; + * libdw does NOT apply relocations for ET_REL files, so we + * also handle DW_FORM_line_strp refs into .debug_line_str. + */ + resolve_covered_sections(elf, all_sections, ALL_SECTIONS); + apply_debug_line_relocations(elf); + text_addr =3D 0; /* unused in module mode */ + } else { + text_addr =3D find_text_addr(elf); + text_end_addr =3D find_text_end_addr(elf); + } =20 dwarf =3D dwarf_begin_elf(elf, DWARF_C_READ, NULL); if (!dwarf) { @@ -528,20 +1192,55 @@ int main(int argc, char *argv[]) return 1; } =20 - process_dwarf(dwarf, text_addr); + if (module_mode) { + unsigned int persistent_total, init_total; + + output_module_assembly(); /* file header only */ =20 - if (skipped_overflow) + /* + * Single DWARF pass classifies every line entry into its + * covering section (or skips it). Each entry is tagged with + * the master-array section_id so per-blob emit can filter. 
+ */ + process_dwarf(dwarf, 0, all_sections, ALL_SECTIONS); + deduplicate(all_sections, ALL_SECTIONS); + compute_file_offsets(); + + emit_blob(".mod_lineinfo", "p", BLOB_PERSISTENT); + emit_blob(".init.mod_lineinfo", "i", BLOB_INIT); + + persistent_total =3D 0; + init_total =3D 0; + for (unsigned int i =3D 0; i < ALL_SECTIONS; i++) { + if (all_sections[i].blob =3D=3D BLOB_PERSISTENT) + persistent_total +=3D all_sections[i].n_entries; + else if (all_sections[i].blob =3D=3D BLOB_INIT) + init_total +=3D all_sections[i].n_entries; + } fprintf(stderr, - "lineinfo: warning: %u entries skipped (offset > 4 GiB from _text)\n", - skipped_overflow); + "lineinfo: persistent %u entries, init %u entries, %u files\n", + persistent_total, init_total, num_files); + + if (skipped_overflow) + fprintf(stderr, + "lineinfo: warning: %u entries skipped (offset > 4 GiB)\n", + skipped_overflow); + } else { + process_dwarf(dwarf, text_addr, NULL, 0); =20 - deduplicate(); - compute_file_offsets(); + if (skipped_overflow) + fprintf(stderr, + "lineinfo: warning: %u entries skipped (offset > 4 GiB from _text)\n", + skipped_overflow); =20 - fprintf(stderr, "lineinfo: %u entries, %u files\n", - num_entries, num_files); + deduplicate(NULL, 0); + compute_file_offsets(); =20 - output_assembly(); + fprintf(stderr, "lineinfo: %u entries, %u files\n", + num_entries, num_files); + + output_assembly(); + } =20 dwarf_end(dwarf); elf_end(elf); @@ -552,6 +1251,5 @@ int main(int argc, char *argv[]) for (unsigned int i =3D 0; i < num_files; i++) free(files[i].name); free(files); - return 0; } --=20 2.53.0 From nobody Wed Jun 10 11:10:56 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DD08E3DF01B; Mon, 4 May 2026 15:35:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none 
smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777908901; cv=none; b=RZV04aQC+KF04uJ/CJtpNXfYo6d2lw8v/9ozabninv6RFYwHJvc8jVeGJtsIR/Fg6VogJXxrECkeU/yFgisgdd0modbODwVeHh7sYwyLyJioHlr5nOVlXdNMpjM13RDWr53JO7QOyy2Zj89Obh2/CsrrMnSrMGC1ueitd4YN4yE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777908901; c=relaxed/simple; bh=NsIf4E4uywtFRePpMZ9CMPpaDKTrm8sYiaoMdmnNTvU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Gw0tVrP2tEDQpu65FiPUe7dIv5BQl9OvDXgPSFd3VLtGjg7AMORQkEmWsP1q31eW0M113HWbkdH1qxIVdEbmuuXGenEvvJFA/a7FefkQq6li9N+TlhM3bUP86OFzu5TMc17OSGISESOQ8CZ6hcdHdmOXqQfCFyVvvJTfHoVgcs0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=PNBYpL4F; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="PNBYpL4F" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 752C5C2BCF7; Mon, 4 May 2026 15:34:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777908901; bh=NsIf4E4uywtFRePpMZ9CMPpaDKTrm8sYiaoMdmnNTvU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PNBYpL4FF4el+RvzBwHNfclmKswMYo+JvN2LO1znZLdCFKgZF+5laBTKd6/k3oRYf Q04ZS6wrK7AUV0Jd2Ci5dYLPOenGMt3Rno625DhG3n29Nn+T9Q/c+WtKwg2gK0qbD4 I0qex8DhZUcIm4xPpoiuBNiQNn5qwWt2zcso2cWaS+OaKQtXNoD8w3vnkB8aoIpX/k meQXx5SdlMo2zpIHmKcjQ0z0rRwvd2oAiP3YD+AdBq7K944uQw4+fwZOCBqALQyExH DzCK7oTua+qIHL+J7YjzMDZQXDg/av/8kgBTsVUdJ68v0Ji/LvX1MVJ9FeJ0o/S3lK dPn1MEi972NNg== From: Sasha Levin To: Andrew Morton , Masahiro Yamada , Luis Chamberlain , Linus Torvalds , Richard Weinberger , Juergen Gross , Geert Uytterhoeven , James Bottomley Cc: Jonathan Corbet , Nathan Chancellor , Nicolas Schier , Petr Pavlu , Daniel Gomez , Greg KH , Petr Mladek , Steven Rostedt , Kees Cook , 
Peter Zijlstra , Thorsten Leemhuis , Vlastimil Babka , Helge Deller , Randy Dunlap , Laurent Pinchart , Vivian Wang , Zhen Lei , Sami Tolvanen , linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org, linux-modules@vger.kernel.org, linux-doc@vger.kernel.org, Sasha Levin Subject: [PATCH v5 3/4] kallsyms: delta-compress lineinfo tables for ~2.7x size reduction Date: Mon, 4 May 2026 11:33:59 -0400 Message-ID: <20260504153401.2416391-4-sashal@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260504153401.2416391-1-sashal@kernel.org> References: <20260504153401.2416391-1-sashal@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Replace the flat uncompressed parallel arrays (lineinfo_addrs[], lineinfo_file_ids[], lineinfo_lines[]) with a block-indexed, delta-encoded, LEB128 varint compressed format. The sorted address array has small deltas between consecutive entries (typically 1-50 bytes), file IDs have high locality (delta often 0, same file), and line numbers change slowly. Delta-encoding followed by LEB128 varint compression (ULEB128 for address deltas, SLEB128 for file ID and line deltas) shrinks most values from 2 or 4 bytes to 1. Entries are grouped into blocks of 64. A small uncompressed block index (first addr + byte offset per block) enables O(log(N/64)) binary search, followed by sequential decode of at most 64 entries (up to three varints each) within the matching block. All decode state lives on the stack -- zero allocations, still safe for NMI/panic context.
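As an illustration of why the address deltas compress so well, here is a minimal sketch of ULEB128 delta-encoding over a sorted address array. It is not taken from the patch: the sketch_* names are hypothetical, and the real generator emits .uleb128/.sleb128 assembler directives rather than encoding bytes itself. The first address is assumed to live out of band, as it does in the block index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: append v to out[] as ULEB128, return bytes written. */
static size_t sketch_put_uleb128(uint8_t *out, size_t cap, uint32_t v)
{
	size_t n = 0;

	do {
		uint8_t byte = v & 0x7f;

		v >>= 7;
		if (v)
			byte |= 0x80;	/* continuation bit: more bytes follow */
		assert(n < cap);	/* caller must supply enough room */
		out[n++] = byte;
	} while (v);
	return n;
}

/* Hypothetical helper: decode one ULEB128 value, never reading past end. */
static uint32_t sketch_get_uleb128(const uint8_t *data, size_t *pos, size_t end)
{
	uint32_t result = 0;
	unsigned int shift = 0;

	while (*pos < end && shift < 32) {
		uint8_t byte = data[(*pos)++];

		result |= (uint32_t)(byte & 0x7f) << shift;
		if (!(byte & 0x80))
			break;
		shift += 7;
	}
	return result;
}

/*
 * Encode addrs[1..count-1] as ULEB128 deltas from the previous entry.
 * addrs[] must be sorted ascending; addrs[0] is stored elsewhere.
 * Returns the number of bytes used in out[].
 */
static size_t sketch_encode_addr_deltas(const uint32_t *addrs, size_t count,
					uint8_t *out, size_t cap)
{
	size_t used = 0;

	for (size_t i = 1; i < count; i++)
		used += sketch_put_uleb128(out + used, cap - used,
					   addrs[i] - addrs[i - 1]);
	return used;
}
```

Deltas below 128 take a single byte, which is the common case for consecutive line-table addresses; larger gaps cost one extra byte per additional 7 bits.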
Measured on a defconfig+debug x86_64 build (3,017,154 entries, 4,822 source files, 47,144 blocks): Before (flat arrays): lineinfo_addrs[] 12,068,616 bytes (u32 x 3.0M) lineinfo_file_ids[] 6,034,308 bytes (u16 x 3.0M) lineinfo_lines[] 12,068,616 bytes (u32 x 3.0M) Total: 30,171,540 bytes (28.8 MiB, 10.0 bytes/entry) After (block-indexed delta + ULEB128): lineinfo_block_addrs[] 188,576 bytes (184 KiB) lineinfo_block_offsets[] 188,576 bytes (184 KiB) lineinfo_data[] 10,926,128 bytes (10.4 MiB) Total: 11,303,280 bytes (10.8 MiB, 3.7 bytes/entry) Savings: 18.0 MiB (2.7x reduction) Booted in QEMU and verified with SysRq-l that annotations still work: default_idle+0x9/0x10 (arch/x86/kernel/process.c:767) default_idle_call+0x6c/0xb0 (kernel/sched/idle.c:122) do_idle+0x335/0x490 (kernel/sched/idle.c:191) cpu_startup_entry+0x4e/0x60 (kernel/sched/idle.c:429) rest_init+0x1aa/0x1b0 (init/main.c:760) Suggested-by: Juergen Gross Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Sasha Levin --- include/linux/mod_lineinfo.h | 236 ++++++++++++++++++++++++++++++----- kernel/kallsyms.c | 46 +++---- kernel/kallsyms_internal.h | 8 +- kernel/module/kallsyms.c | 106 +++++++--------- scripts/empty_lineinfo.S | 20 ++- scripts/gen_lineinfo.c | 185 ++++++++++++++++++--------- scripts/kallsyms.c | 7 +- 7 files changed, 420 insertions(+), 188 deletions(-) diff --git a/include/linux/mod_lineinfo.h b/include/linux/mod_lineinfo.h index 9cda3263a0784..a3c7143433020 100644 --- a/include/linux/mod_lineinfo.h +++ b/include/linux/mod_lineinfo.h @@ -3,9 +3,9 @@ * mod_lineinfo.h - Binary format for per-module source line information * * This header defines the layout of the .mod_lineinfo and - * .init.mod_lineinfo sections embedded in loadable kernel modules. It - * is dual-use: included from both the kernel and the userspace - * gen_lineinfo tool. + * .init.mod_lineinfo sections embedded in loadable kernel modules. 
It is + * dual-use: included from both the kernel and the userspace gen_lineinfo + * tool. * * Top-level layout (all values in target-native endianness): * @@ -20,16 +20,27 @@ * If the relocation fails to resolve (e.g. unknown reloc type), .anchor * stays zero and lookups silently degrade to "no annotation". * - * Each per-section sub-table is laid out as a stand-alone - * mod_lineinfo_header followed by parallel arrays: + * Each per-section sub-table is laid out exactly as a stand-alone + * mod_lineinfo_header followed by its arrays: * - * struct mod_lineinfo_header (16 bytes) - * u32 addrs[num_entries] -- offsets from this section's base, s= orted - * u16 file_ids[num_entries] -- parallel to addrs - * <2-byte pad if num_entries is odd> - * u32 lines[num_entries] -- parallel to addrs + * struct mod_lineinfo_header + * u32 block_addrs[num_blocks] -- first addr per block, for binary se= arch + * u32 block_offsets[num_blocks] -- byte offset into compressed data st= ream + * u8 data[data_size] -- LEB128 delta-compressed entries * u32 file_offsets[num_files] -- byte offset into filenames[] * char filenames[filenames_size] -- concatenated NUL-terminated strings + * + * Each sub-array is located by an explicit (offset, size) pair in the + * header, similar to a flattened devicetree. All offsets in the per-sect= ion + * header are relative to that header itself, so a sub-table is fully + * self-describing. + * + * Compressed stream format (per block of LINEINFO_BLOCK_ENTRIES entries): + * Entry 0: file_id (ULEB128), line (ULEB128) + * addr is in block_addrs[] + * Entry 1..N: addr_delta (ULEB128), + * file_id_delta (SLEB128), + * line_delta (SLEB128) */ #ifndef _LINUX_MOD_LINEINFO_H #define _LINUX_MOD_LINEINFO_H @@ -40,9 +51,12 @@ #include typedef uint32_t u32; typedef uint16_t u16; +typedef uint8_t u8; typedef uint64_t u64; #endif =20 +#define LINEINFO_BLOCK_ENTRIES 64 + /* * Per-section descriptor. 
One entry per ELF text section covered by the * blob (.text, .exit.text, .init.text, ...). @@ -66,39 +80,201 @@ struct mod_lineinfo_root { =20 struct mod_lineinfo_header { u32 num_entries; + u32 num_blocks; u32 num_files; - u32 filenames_size; /* total bytes of concatenated filenames */ + u32 blocks_offset; /* offset to block_addrs[] from this header */ + u32 blocks_size; /* bytes: num_blocks * 2 * sizeof(u32) */ + u32 data_offset; /* offset to compressed stream */ + u32 data_size; /* bytes of compressed data */ + u32 files_offset; /* offset to file_offsets[] */ + u32 files_size; /* bytes: num_files * sizeof(u32) */ + u32 filenames_offset; + u32 filenames_size; }; =20 -/* Offset helpers: compute byte offset from the per-section header to each= array. */ +/* + * Descriptor for a lineinfo table, used by the shared lookup function. + * Callers populate this from either linker globals (vmlinux) or a + * validated mod_lineinfo_header (modules). + */ +struct lineinfo_table { + const u32 *blk_addrs; + const u32 *blk_offsets; + const u8 *data; + u32 data_size; + const u32 *file_offsets; + const char *filenames; + u32 num_entries; + u32 num_blocks; + u32 num_files; + u32 filenames_size; +}; =20 -static inline u32 mod_lineinfo_addrs_off(void) +/* + * Read a ULEB128 varint from a byte stream. + * Returns the decoded value and advances *pos past the encoded bytes. + * If *pos would exceed 'end', returns 0 and sets *pos =3D end (safe for + * NMI/panic context: no crash, just a missed annotation). 
+ */ +static inline u32 lineinfo_read_uleb128(const u8 *data, u32 *pos, u32 end) { - return sizeof(struct mod_lineinfo_header); -} + u32 result =3D 0; + unsigned int shift =3D 0; =20 -static inline u32 mod_lineinfo_file_ids_off(u32 num_entries) -{ - return mod_lineinfo_addrs_off() + num_entries * sizeof(u32); + while (*pos < end) { + u8 byte =3D data[*pos]; + (*pos)++; + result |=3D (u32)(byte & 0x7f) << shift; + if (!(byte & 0x80)) + return result; + shift +=3D 7; + if (shift >=3D 32) { + /* Malformed: skip remaining continuation bytes */ + while (*pos < end && (data[*pos] & 0x80)) + (*pos)++; + if (*pos < end) + (*pos)++; + return result; + } + } + return result; } =20 -static inline u32 mod_lineinfo_lines_off(u32 num_entries) +/* Read an SLEB128 varint. Same safety guarantees as above. */ +static inline int32_t lineinfo_read_sleb128(const u8 *data, u32 *pos, u32 = end) { - /* u16 file_ids[] may need 2-byte padding to align lines[] to 4 bytes */ - u32 off =3D mod_lineinfo_file_ids_off(num_entries) + - num_entries * sizeof(u16); - return (off + 3) & ~3u; -} + int32_t result =3D 0; + unsigned int shift =3D 0; + u8 byte =3D 0; =20 -static inline u32 mod_lineinfo_file_offsets_off(u32 num_entries) -{ - return mod_lineinfo_lines_off(num_entries) + num_entries * sizeof(u32); + while (*pos < end) { + byte =3D data[*pos]; + (*pos)++; + result |=3D (int32_t)(byte & 0x7f) << shift; + shift +=3D 7; + if (!(byte & 0x80)) + break; + if (shift >=3D 32) { + while (*pos < end && (data[*pos] & 0x80)) + (*pos)++; + if (*pos < end) + (*pos)++; + return result; + } + } + + /* Sign-extend if the high bit of the last byte was set */ + if (shift < 32 && (byte & 0x40)) + result |=3D -(1 << shift); + + return result; } =20 -static inline u32 mod_lineinfo_filenames_off(u32 num_entries, u32 num_file= s) +/* + * Search a lineinfo table for the source file and line corresponding to a + * given offset (from _text for vmlinux, from .text base for modules). 
+ * + * Safe for NMI and panic context: no locks, no allocations, all state on = stack. + * Returns true and sets @file and @line on success; false on any failure. + */ +static inline bool lineinfo_search(const struct lineinfo_table *tbl, + unsigned int offset, + const char **file, unsigned int *line) { - return mod_lineinfo_file_offsets_off(num_entries) + - num_files * sizeof(u32); + unsigned int low, high, mid, block; + unsigned int cur_addr, cur_file_id, cur_line; + unsigned int best_file_id =3D 0, best_line =3D 0; + unsigned int block_entries, data_end; + bool found =3D false; + u32 pos; + + if (!tbl->num_entries || !tbl->num_blocks) + return false; + + /* Binary search on blk_addrs[] to find the right block */ + low =3D 0; + high =3D tbl->num_blocks; + while (low < high) { + mid =3D low + (high - low) / 2; + if (tbl->blk_addrs[mid] <=3D offset) + low =3D mid + 1; + else + high =3D mid; + } + + if (low =3D=3D 0) + return false; + block =3D low - 1; + + /* How many entries in this block? 
*/ + block_entries =3D LINEINFO_BLOCK_ENTRIES; + if (block =3D=3D tbl->num_blocks - 1) { + unsigned int remaining =3D tbl->num_entries - + block * LINEINFO_BLOCK_ENTRIES; + + if (remaining < block_entries) + block_entries =3D remaining; + } + + /* Determine end of this block's data in the compressed stream */ + if (block + 1 < tbl->num_blocks) + data_end =3D tbl->blk_offsets[block + 1]; + else + data_end =3D tbl->data_size; + + /* Clamp data_end to actual data size */ + if (data_end > tbl->data_size) + data_end =3D tbl->data_size; + + /* Decode entry 0: addr from blk_addrs, file_id and line from stream */ + pos =3D tbl->blk_offsets[block]; + if (pos >=3D data_end) + return false; + + cur_addr =3D tbl->blk_addrs[block]; + cur_file_id =3D lineinfo_read_uleb128(tbl->data, &pos, data_end); + cur_line =3D lineinfo_read_uleb128(tbl->data, &pos, data_end); + + /* Check entry 0 */ + if (cur_addr <=3D offset) { + best_file_id =3D cur_file_id; + best_line =3D cur_line; + found =3D true; + } + + /* Decode entries 1..N */ + for (unsigned int i =3D 1; i < block_entries; i++) { + unsigned int addr_delta; + int32_t file_delta, line_delta; + + addr_delta =3D lineinfo_read_uleb128(tbl->data, &pos, data_end); + file_delta =3D lineinfo_read_sleb128(tbl->data, &pos, data_end); + line_delta =3D lineinfo_read_sleb128(tbl->data, &pos, data_end); + + cur_addr +=3D addr_delta; + cur_file_id =3D (unsigned int)((int32_t)cur_file_id + file_delta); + cur_line =3D (unsigned int)((int32_t)cur_line + line_delta); + + if (cur_addr > offset) + break; + + best_file_id =3D cur_file_id; + best_line =3D cur_line; + found =3D true; + } + + if (!found) + return false; + + if (best_file_id >=3D tbl->num_files) + return false; + + if (tbl->file_offsets[best_file_id] >=3D tbl->filenames_size) + return false; + + *file =3D &tbl->filenames[tbl->file_offsets[best_file_id]]; + *line =3D best_line; + return true; } =20 #endif /* _LINUX_MOD_LINEINFO_H */ diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c index 
d95387f51b4c0..1f58b4123a8ae 100644 --- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -467,13 +467,16 @@ static int append_buildid(char *buffer, const char = *modname, =20 #endif /* CONFIG_STACKTRACE_BUILD_ID */ =20 +#include + bool kallsyms_lookup_lineinfo(unsigned long addr, const char **file, unsigned int *line) { + struct lineinfo_table tbl; unsigned long long raw_offset; - unsigned int offset, low, high, mid, file_id; =20 - if (!IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) || !lineinfo_num_entries) + if (!IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) || + !lineinfo_num_entries || !lineinfo_num_blocks) return false; =20 /* Compute offset from _text */ @@ -483,34 +486,19 @@ bool kallsyms_lookup_lineinfo(unsigned long addr, raw_offset =3D addr - (unsigned long)_text; if (raw_offset > UINT_MAX) return false; - offset =3D (unsigned int)raw_offset; - - /* Binary search for largest entry <=3D offset */ - low =3D 0; - high =3D lineinfo_num_entries; - while (low < high) { - mid =3D low + (high - low) / 2; - if (lineinfo_addrs[mid] <=3D offset) - low =3D mid + 1; - else - high =3D mid; - } - - if (low =3D=3D 0) - return false; - low--; - - file_id =3D lineinfo_file_ids[low]; - *line =3D lineinfo_lines[low]; - - if (file_id >=3D lineinfo_num_files) - return false; - - if (lineinfo_file_offsets[file_id] >=3D lineinfo_filenames_size) - return false; =20 - *file =3D &lineinfo_filenames[lineinfo_file_offsets[file_id]]; - return true; + tbl.blk_addrs =3D lineinfo_block_addrs; + tbl.blk_offsets =3D lineinfo_block_offsets; + tbl.data =3D lineinfo_data; + tbl.data_size =3D lineinfo_data_size; + tbl.file_offsets =3D lineinfo_file_offsets; + tbl.filenames =3D lineinfo_filenames; + tbl.num_entries =3D lineinfo_num_entries; + tbl.num_blocks =3D lineinfo_num_blocks; + tbl.num_files =3D lineinfo_num_files; + tbl.filenames_size =3D lineinfo_filenames_size; + + return lineinfo_search(&tbl, (unsigned int)raw_offset, file, line); } =20 /* Look up a kernel symbol and return it in a text buffer. 
*/ diff --git a/kernel/kallsyms_internal.h b/kernel/kallsyms_internal.h index d7374ce444d81..ffe4c658067ec 100644 --- a/kernel/kallsyms_internal.h +++ b/kernel/kallsyms_internal.h @@ -16,10 +16,12 @@ extern const unsigned int kallsyms_markers[]; extern const u8 kallsyms_seqs_of_names[]; =20 extern const u32 lineinfo_num_entries; -extern const u32 lineinfo_addrs[]; -extern const u16 lineinfo_file_ids[]; -extern const u32 lineinfo_lines[]; extern const u32 lineinfo_num_files; +extern const u32 lineinfo_num_blocks; +extern const u32 lineinfo_block_addrs[]; +extern const u32 lineinfo_block_offsets[]; +extern const u32 lineinfo_data_size; +extern const u8 lineinfo_data[]; extern const u32 lineinfo_file_offsets[]; extern const u32 lineinfo_filenames_size; extern const char lineinfo_filenames[]; diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c index 819d6594c2937..2bec9f0e6afc5 100644 --- a/kernel/module/kallsyms.c +++ b/kernel/module/kallsyms.c @@ -498,9 +498,9 @@ int module_kallsyms_on_each_symbol(const char *modname, #include =20 /* - * Search one per-section sub-table for @section_offset using flat parallel - * arrays. @hdr is the per-section header at byte offset @hdr_offset with= in - * @blob. Returns true on hit and populates @file / @line. + * Search one per-section sub-table for @section_offset. + * @hdr is the per-section header at byte offset @hdr_offset within @blob. + * Returns true on hit and populates @file / @line. 
*/ static bool module_lookup_lineinfo_section(const void *blob, u32 blob_size, u32 hdr_offset, @@ -509,85 +509,71 @@ static bool module_lookup_lineinfo_section(const void= *blob, u32 blob_size, unsigned int *line) { const struct mod_lineinfo_header *hdr; - const u8 *base; - const u32 *addrs, *lines, *file_offsets; - const u16 *file_ids; - const char *filenames; - u32 num_entries, num_files, filenames_size; - unsigned int low, high, mid; - u16 file_id; + struct lineinfo_table tbl; + const void *base; =20 if (hdr_offset > blob_size || blob_size - hdr_offset < sizeof(*hdr)) return false; =20 base =3D (const u8 *)blob + hdr_offset; - hdr =3D (const struct mod_lineinfo_header *)base; - num_entries =3D hdr->num_entries; - num_files =3D hdr->num_files; - filenames_size =3D hdr->filenames_size; + hdr =3D base; =20 - if (num_entries =3D=3D 0) + if (hdr->num_entries =3D=3D 0 || hdr->num_blocks =3D=3D 0) return false; =20 - /* - * Validate counts before multiplying =E2=80=94 sizing arithmetic could - * otherwise overflow on 32-bit with a malformed blob. Each entry - * contributes one u32 (addrs), one u16 (file_ids), and one u32 - * (lines); each file contributes one u32 (file_offsets). - */ + /* Validate each sub-array fits within the remaining blob bytes */ { u32 avail =3D blob_size - hdr_offset; - u32 needed =3D mod_lineinfo_filenames_off(num_entries, num_files); =20 - if (num_entries > U32_MAX / sizeof(u32)) + if (hdr->blocks_offset > avail || + hdr->blocks_size > avail - hdr->blocks_offset) + return false; + if (hdr->data_offset > avail || + hdr->data_size > avail - hdr->data_offset) return false; - if (num_files > U32_MAX / sizeof(u32)) + if (hdr->files_offset > avail || + hdr->files_size > avail - hdr->files_offset) return false; - if (needed > avail || filenames_size > avail - needed) + if (hdr->filenames_offset > avail || + hdr->filenames_size > avail - hdr->filenames_offset) return false; } =20 /* - * Filenames are read as NUL-terminated C strings. 
Require the blob - * to end in NUL so a malformed file_offsets entry can never lead the - * later "%s" consumer past the end of the section. + * Validate counts before multiplying by element size =E2=80=94 multiplic= ation + * could otherwise overflow on 32-bit builds with a malformed blob. + * num_blocks contributes (addr,offset) u32 pairs; num_files contributes + * one u32 each. */ - if (filenames_size =3D=3D 0 || - base[mod_lineinfo_filenames_off(num_entries, num_files) + - filenames_size - 1] !=3D 0) + if (hdr->num_blocks > hdr->blocks_size / (2 * sizeof(u32))) return false; - - addrs =3D (const u32 *)(base + mod_lineinfo_addrs_off()); - file_ids =3D (const u16 *)(base + mod_lineinfo_file_ids_off(num_entries)); - lines =3D (const u32 *)(base + mod_lineinfo_lines_off(num_entries)); - file_offsets =3D (const u32 *)(base + mod_lineinfo_file_offsets_off(num_e= ntries)); - filenames =3D (const char *)(base + mod_lineinfo_filenames_off(num_entrie= s, num_files)); - - /* Binary search for largest entry <=3D section_offset. */ - low =3D 0; - high =3D num_entries; - while (low < high) { - mid =3D low + (high - low) / 2; - if (addrs[mid] <=3D section_offset) - low =3D mid + 1; - else - high =3D mid; - } - - if (low =3D=3D 0) + if (hdr->num_files > hdr->files_size / sizeof(u32)) return false; - low--; =20 - file_id =3D file_ids[low]; - if (file_id >=3D num_files) - return false; - if (file_offsets[file_id] >=3D filenames_size) + /* + * Filenames are read as NUL-terminated C strings. Require the blob + * to end in NUL so a malformed file_offsets entry can never lead the + * later "%s" consumer past the end of the section. 
+ */ + if (hdr->filenames_size =3D=3D 0 || + ((const u8 *)base)[hdr->filenames_offset + + hdr->filenames_size - 1] !=3D 0) return false; =20 - *file =3D &filenames[file_offsets[file_id]]; - *line =3D lines[low]; - return true; + tbl.blk_addrs =3D base + hdr->blocks_offset; + tbl.blk_offsets =3D base + hdr->blocks_offset + + hdr->num_blocks * sizeof(u32); + tbl.data =3D base + hdr->data_offset; + tbl.data_size =3D hdr->data_size; + tbl.file_offsets =3D base + hdr->files_offset; + tbl.filenames =3D base + hdr->filenames_offset; + tbl.num_entries =3D hdr->num_entries; + tbl.num_blocks =3D hdr->num_blocks; + tbl.num_files =3D hdr->num_files; + tbl.filenames_size =3D hdr->filenames_size; + + return lineinfo_search(&tbl, section_offset, file, line); } =20 /* @@ -609,6 +595,7 @@ static bool module_lookup_lineinfo_blob(const void *blo= b, u32 blob_size, if (root->num_sections =3D=3D 0) return false; =20 + /* Validate sections[] array fits within the blob */ if (root->num_sections > U32_MAX / sizeof(struct mod_lineinfo_section)) return false; sections_end =3D sizeof(*root) + @@ -642,6 +629,9 @@ static bool module_lookup_lineinfo_blob(const void *blo= b, u32 blob_size, =20 /* * Look up source file:line for an address within a loaded module. + * Uses the .mod_lineinfo / .init.mod_lineinfo sections embedded in the .ko + * at build time. Each section contains one or more per-section sub-tables + * keyed by an ELF-relocation-resolved anchor. * * Safe in NMI/panic context: no locks, no allocations. 
* Caller must hold RCU read lock (or be in a context where the module diff --git a/scripts/empty_lineinfo.S b/scripts/empty_lineinfo.S index e058c41137123..edd5b1092f050 100644 --- a/scripts/empty_lineinfo.S +++ b/scripts/empty_lineinfo.S @@ -14,12 +14,20 @@ lineinfo_num_entries: .balign 4 lineinfo_num_files: .long 0 - .globl lineinfo_addrs -lineinfo_addrs: - .globl lineinfo_file_ids -lineinfo_file_ids: - .globl lineinfo_lines -lineinfo_lines: + .globl lineinfo_num_blocks + .balign 4 +lineinfo_num_blocks: + .long 0 + .globl lineinfo_block_addrs +lineinfo_block_addrs: + .globl lineinfo_block_offsets +lineinfo_block_offsets: + .globl lineinfo_data_size + .balign 4 +lineinfo_data_size: + .long 0 + .globl lineinfo_data +lineinfo_data: .globl lineinfo_file_offsets lineinfo_file_offsets: .globl lineinfo_filenames_size diff --git a/scripts/gen_lineinfo.c b/scripts/gen_lineinfo.c index e1e08469b4f2f..394690a23a2f7 100644 --- a/scripts/gen_lineinfo.c +++ b/scripts/gen_lineinfo.c @@ -825,6 +825,45 @@ static void deduplicate(struct covered_section *sectio= ns, } } =20 +/* + * Emit the LEB128 delta-compressed data stream for one block. + * @base is the absolute index of the first entry, @count is the number of + * entries in this block (<=3D LINEINFO_BLOCK_ENTRIES). Used by both vmli= nux + * mode (one section, full entries[]) and module mode (per-section ranges). 
+ */ +static void emit_block_data_range(unsigned int base, unsigned int count) +{ + if (!count) + return; + + /* Entry 0: file_id, line (both unsigned) */ + printf("\t.uleb128 %u\n", entries[base].file_id); + printf("\t.uleb128 %u\n", entries[base].line); + + /* Entries 1..N: addr_delta (unsigned), file/line deltas (signed) */ + for (unsigned int i =3D 1; i < count; i++) { + unsigned int idx =3D base + i; + + printf("\t.uleb128 %u\n", + entries[idx].offset - entries[idx - 1].offset); + printf("\t.sleb128 %d\n", + (int)entries[idx].file_id - (int)entries[idx - 1].file_id); + printf("\t.sleb128 %d\n", + (int)entries[idx].line - (int)entries[idx - 1].line); + } +} + +/* Vmlinux-mode wrapper: pick block index out of the global entries[]. */ +static void emit_block_data(unsigned int block) +{ + unsigned int base =3D block * LINEINFO_BLOCK_ENTRIES; + unsigned int count =3D num_entries - base; + + if (count > LINEINFO_BLOCK_ENTRIES) + count =3D LINEINFO_BLOCK_ENTRIES; + emit_block_data_range(base, count); +} + static void compute_file_offsets(void) { unsigned int offset =3D 0; @@ -848,6 +887,11 @@ static void print_escaped_asciz(const char *s) =20 static void output_assembly(void) { + unsigned int num_blocks; + + num_blocks =3D num_entries ? 
+ (num_entries + LINEINFO_BLOCK_ENTRIES - 1) / LINEINFO_BLOCK_ENTRIES : 0; + printf("/* SPDX-License-Identifier: GPL-2.0 */\n"); printf("/*\n"); printf(" * Automatically generated by scripts/gen_lineinfo\n"); @@ -868,29 +912,40 @@ static void output_assembly(void) printf("lineinfo_num_files:\n"); printf("\t.long %u\n\n", num_files); =20 - /* Sorted address offsets from _text */ - printf("\t.globl lineinfo_addrs\n"); + /* Number of blocks */ + printf("\t.globl lineinfo_num_blocks\n"); printf("\t.balign 4\n"); - printf("lineinfo_addrs:\n"); - for (unsigned int i =3D 0; i < num_entries; i++) - printf("\t.long 0x%x\n", entries[i].offset); - printf("\n"); + printf("lineinfo_num_blocks:\n"); + printf("\t.long %u\n\n", num_blocks); =20 - /* File IDs, parallel to addrs (u16 -- supports up to 65535 files) */ - printf("\t.globl lineinfo_file_ids\n"); - printf("\t.balign 2\n"); - printf("lineinfo_file_ids:\n"); - for (unsigned int i =3D 0; i < num_entries; i++) - printf("\t.short %u\n", entries[i].file_id); - printf("\n"); + /* Block first-addresses for binary search */ + printf("\t.globl lineinfo_block_addrs\n"); + printf("\t.balign 4\n"); + printf("lineinfo_block_addrs:\n"); + for (unsigned int i =3D 0; i < num_blocks; i++) + printf("\t.long 0x%x\n", entries[i * LINEINFO_BLOCK_ENTRIES].offset); =20 - /* Line numbers, parallel to addrs */ - printf("\t.globl lineinfo_lines\n"); + /* Block byte offsets into compressed stream */ + printf("\t.globl lineinfo_block_offsets\n"); printf("\t.balign 4\n"); - printf("lineinfo_lines:\n"); - for (unsigned int i =3D 0; i < num_entries; i++) - printf("\t.long %u\n", entries[i].line); - printf("\n"); + printf("lineinfo_block_offsets:\n"); + for (unsigned int i =3D 0; i < num_blocks; i++) + printf("\t.long .Lblock_%u - lineinfo_data\n", i); + + /* Compressed data size */ + printf("\t.globl lineinfo_data_size\n"); + printf("\t.balign 4\n"); + printf("lineinfo_data_size:\n"); + printf("\t.long .Ldata_end - lineinfo_data\n\n"); + + /* 
Compressed data stream */ + printf("\t.globl lineinfo_data\n"); + printf("lineinfo_data:\n"); + for (unsigned int i =3D 0; i < num_blocks; i++) { + printf(".Lblock_%u:\n", i); + emit_block_data(i); + } + printf(".Ldata_end:\n\n"); =20 /* File string offset table */ printf("\t.globl lineinfo_file_offsets\n"); @@ -898,71 +953,81 @@ static void output_assembly(void) printf("lineinfo_file_offsets:\n"); for (unsigned int i =3D 0; i < num_files; i++) printf("\t.long %u\n", files[i].str_offset); - printf("\n"); =20 /* Filenames size */ - { - unsigned int fsize =3D 0; - - for (unsigned int i =3D 0; i < num_files; i++) - fsize +=3D strlen(files[i].name) + 1; - printf("\t.globl lineinfo_filenames_size\n"); - printf("\t.balign 4\n"); - printf("lineinfo_filenames_size:\n"); - printf("\t.long %u\n\n", fsize); - } + printf("\t.globl lineinfo_filenames_size\n"); + printf("\t.balign 4\n"); + printf("lineinfo_filenames_size:\n"); + printf("\t.long .Lfilenames_end - lineinfo_filenames\n\n"); =20 /* Concatenated NUL-terminated filenames */ printf("\t.globl lineinfo_filenames\n"); printf("lineinfo_filenames:\n"); for (unsigned int i =3D 0; i < num_files; i++) print_escaped_asciz(files[i].name); - printf("\n"); + printf(".Lfilenames_end:\n"); } =20 /* - * Emit one per-section table in the simple flat-array layout: + * Emit one per-section table. @suffix uniquifies the local labels so + * multiple tables can coexist in a single output blob; @blob_root_label + * is the symbol for the start of the enclosing blob (used for + * table_offset =3D .Lhdr - .Lroot). * - * mod_lineinfo_header - * addrs[count] (u32, sorted) - * file_ids[count] (u16) + 2-byte pad if count is odd - * lines[count] (u32) - * file_offsets[] (u32) - * filenames[] - * - * @suffix uniquifies labels so multiple tables can coexist in one blob. - * Caller has sorted entries[] so this section's entries occupy [first, - * first + count). 
+ * Caller has already sorted entries[] so this section's entries occupy + * the contiguous range [first, first + count). This function emits + * block-relative addresses computed from entries[first + N].offset. */ static void emit_section_table(unsigned int first, unsigned int count, const char *suffix) { + unsigned int num_blocks; + + num_blocks =3D count ? + (count + LINEINFO_BLOCK_ENTRIES - 1) / LINEINFO_BLOCK_ENTRIES : 0; + printf(".Lhdr%s:\n", suffix); printf("\t.balign 4\n"); printf("\t.long %u\t\t/* num_entries */\n", count); + printf("\t.long %u\t\t/* num_blocks */\n", num_blocks); printf("\t.long %u\t\t/* num_files */\n", num_files); + printf("\t.long .Lblk_addrs%s - .Lhdr%s\n", suffix, suffix); + printf("\t.long .Lblk_offsets_end%s - .Lblk_addrs%s\n", suffix, suffix); + printf("\t.long .Ldata%s - .Lhdr%s\n", suffix, suffix); + printf("\t.long .Ldata_end%s - .Ldata%s\n", suffix, suffix); + printf("\t.long .Lfile_offsets%s - .Lhdr%s\n", suffix, suffix); + printf("\t.long .Lfile_offsets_end%s - .Lfile_offsets%s\n", suffix, suffi= x); + printf("\t.long .Lfilenames%s - .Lhdr%s\n", suffix, suffix); printf("\t.long .Lfilenames_end%s - .Lfilenames%s\n\n", suffix, suffix); =20 - /* addrs[] */ - for (unsigned int i =3D 0; i < count; i++) - printf("\t.long 0x%x\n", entries[first + i].offset); - - /* file_ids[] */ - for (unsigned int i =3D 0; i < count; i++) - printf("\t.short %u\n", entries[first + i].file_id); - if (count & 1) - printf("\t.short 0\t\t/* pad to align lines[] */\n"); - - /* lines[] */ - for (unsigned int i =3D 0; i < count; i++) - printf("\t.long %u\n", entries[first + i].line); + printf(".Lblk_addrs%s:\n", suffix); + for (unsigned int i =3D 0; i < num_blocks; i++) + printf("\t.long 0x%x\n", + entries[first + i * LINEINFO_BLOCK_ENTRIES].offset); + + printf(".Lblk_offsets%s:\n", suffix); + for (unsigned int i =3D 0; i < num_blocks; i++) + printf("\t.long .Lblock%s_%u - .Ldata%s\n", suffix, i, suffix); + printf(".Lblk_offsets_end%s:\n\n", suffix); + + 
printf(".Ldata%s:\n", suffix); + for (unsigned int i =3D 0; i < num_blocks; i++) { + unsigned int base =3D first + i * LINEINFO_BLOCK_ENTRIES; + unsigned int n =3D count - i * LINEINFO_BLOCK_ENTRIES; + + if (n > LINEINFO_BLOCK_ENTRIES) + n =3D LINEINFO_BLOCK_ENTRIES; + printf(".Lblock%s_%u:\n", suffix, i); + emit_block_data_range(base, n); + } + printf(".Ldata_end%s:\n", suffix); =20 - /* file_offsets[] */ printf("\t.balign 4\n"); + printf(".Lfile_offsets%s:\n", suffix); for (unsigned int i =3D 0; i < num_files; i++) printf("\t.long %u\n", files[i].str_offset); + printf(".Lfile_offsets_end%s:\n\n", suffix); =20 - /* filenames[] */ printf(".Lfilenames%s:\n", suffix); for (unsigned int i =3D 0; i < num_files; i++) print_escaped_asciz(files[i].name); @@ -1236,8 +1301,10 @@ int main(int argc, char *argv[]) deduplicate(NULL, 0); compute_file_offsets(); =20 - fprintf(stderr, "lineinfo: %u entries, %u files\n", - num_entries, num_files); + fprintf(stderr, "lineinfo: %u entries, %u files, %u blocks\n", + num_entries, num_files, + num_entries ? 
+			(num_entries + LINEINFO_BLOCK_ENTRIES - 1) / LINEINFO_BLOCK_ENTRIES : 0);
 
 		output_assembly();
 	}
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 42662c4fbc6c9..94fbdad3df7c6 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -80,11 +80,12 @@ static bool is_ignored_symbol(const char *name, char type)
 {
 	/* Ignore lineinfo symbols for kallsyms pass stability */
 	static const char * const lineinfo_syms[] = {
-		"lineinfo_addrs",
-		"lineinfo_file_ids",
+		"lineinfo_block_addrs",
+		"lineinfo_block_offsets",
+		"lineinfo_data",
 		"lineinfo_file_offsets",
 		"lineinfo_filenames",
-		"lineinfo_lines",
+		"lineinfo_num_blocks",
 		"lineinfo_num_entries",
 		"lineinfo_num_files",
 	};
-- 
2.53.0

From nobody Wed Jun 10 11:10:56 2026
From: Sasha Levin
To: Andrew Morton, Masahiro Yamada, Luis Chamberlain, Linus Torvalds,
 Richard Weinberger, Juergen Gross, Geert Uytterhoeven, James Bottomley
Cc: Jonathan Corbet, Nathan Chancellor, Nicolas Schier, Petr Pavlu,
 Daniel Gomez, Greg KH, Petr Mladek, Steven Rostedt, Kees Cook,
 Peter Zijlstra, Thorsten Leemhuis, Vlastimil Babka, Helge Deller,
 Randy Dunlap, Laurent Pinchart, Vivian Wang, Zhen Lei, Sami Tolvanen,
 linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org,
 linux-modules@vger.kernel.org, linux-doc@vger.kernel.org, Sasha Levin
Subject: [PATCH v5 4/4] kallsyms: add KUnit tests for lineinfo feature
Date: Mon, 4 May 2026 11:34:00 -0400
Message-ID: <20260504153401.2416391-5-sashal@kernel.org>
In-Reply-To: <20260504153401.2416391-1-sashal@kernel.org>
References: <20260504153401.2416391-1-sashal@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add a KUnit test module (CONFIG_LINEINFO_KUNIT_TEST) that verifies the
kallsyms lineinfo feature produces correct source file:line annotations
in stack traces.

Export sprint_backtrace() and sprint_backtrace_build_id() as GPL symbols
so the test module can exercise the backtrace APIs.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Sasha Levin
---
 kernel/kallsyms.c          |   2 +
 lib/Kconfig.debug          |  10 +
 lib/tests/Makefile         |   3 +
 lib/tests/lineinfo_kunit.c | 855 +++++++++++++++++++++++++++++++++++++
 4 files changed, 870 insertions(+)
 create mode 100644 lib/tests/lineinfo_kunit.c

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 1f58b4123a8ae..6ac2786cdcbcb 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -633,6 +633,7 @@ int sprint_backtrace(char *buffer, unsigned long address)
 {
 	return __sprint_symbol(buffer, address, -1, 1, 0, 1);
 }
+EXPORT_SYMBOL_GPL(sprint_backtrace);
 
 /**
  * sprint_backtrace_build_id - Look up a backtrace symbol and return it in a text buffer
@@ -653,6 +654,7 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
 {
 	return __sprint_symbol(buffer, address, -1, 1, 1, 1);
 }
+EXPORT_SYMBOL_GPL(sprint_backtrace_build_id);
 
 /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
 struct kallsym_iter {
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8ff5adcfe1e0a..27cb92fd131ad 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -3077,6 +3077,16 @@ config LONGEST_SYM_KUNIT_TEST
 
 	  If unsure, say N.
 
+config LINEINFO_KUNIT_TEST
+	tristate "KUnit tests for kallsyms lineinfo" if !KUNIT_ALL_TESTS
+	depends on KUNIT && KALLSYMS_LINEINFO
+	default KUNIT_ALL_TESTS
+	help
+	  KUnit tests for the kallsyms source line info feature.
+	  Verifies that stack traces include correct (file.c:line) annotations.
+
+	  If unsure, say N.
+
 config HW_BREAKPOINT_KUNIT_TEST
 	bool "Test hw_breakpoint constraints accounting" if !KUNIT_ALL_TESTS
 	depends on HAVE_HW_BREAKPOINT
diff --git a/lib/tests/Makefile b/lib/tests/Makefile
index 7e9c2fa52e35a..3a0100338c160 100644
--- a/lib/tests/Makefile
+++ b/lib/tests/Makefile
@@ -36,6 +36,9 @@ obj-$(CONFIG_LIVEUPDATE_TEST) += liveupdate.o
 CFLAGS_longest_symbol_kunit.o += $(call cc-disable-warning, missing-prototypes)
 obj-$(CONFIG_LONGEST_SYM_KUNIT_TEST) += longest_symbol_kunit.o
 
+CFLAGS_lineinfo_kunit.o += $(call cc-option,-fno-inline-functions-called-once)
+obj-$(CONFIG_LINEINFO_KUNIT_TEST) += lineinfo_kunit.o
+
 obj-$(CONFIG_MEMCPY_KUNIT_TEST) += memcpy_kunit.o
 obj-$(CONFIG_MIN_HEAP_KUNIT_TEST) += min_heap_kunit.o
 CFLAGS_overflow_kunit.o = $(call cc-disable-warning, tautological-constant-out-of-range-compare)
diff --git a/lib/tests/lineinfo_kunit.c b/lib/tests/lineinfo_kunit.c
new file mode 100644
index 0000000000000..285d798cb6a3c
--- /dev/null
+++ b/lib/tests/lineinfo_kunit.c
@@ -0,0 +1,855 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit tests for kallsyms lineinfo (CONFIG_KALLSYMS_LINEINFO).
+ *
+ * Copyright (c) 2026 Sasha Levin
+ *
+ * Verifies that sprint_symbol() and related APIs append correct
+ * " (file.c:NNN)" annotations to kernel symbol lookups.
+ *
+ * Build with: CONFIG_LINEINFO_KUNIT_TEST=m (or =y)
+ * Run with:   ./tools/testing/kunit/kunit.py run lineinfo
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/* --------------- helpers --------------- */
+
+static char *alloc_sym_buf(struct kunit *test)
+{
+	return kunit_kzalloc(test, KSYM_SYMBOL_LEN, GFP_KERNEL);
+}
+
+/*
+ * Format a symbol with lineinfo annotation.  Lineinfo is appended only
+ * via the sprint_backtrace*() entry points (kernel/kallsyms.c only adds
+ * the "(file:line)" suffix in stack-trace context; sprint_symbol() is
+ * used by %ps and many existing format strings tack literal "()" after
+ * %ps, where the annotation would render as "foo (file:line)()").
+ *
+ * sprint_backtrace() subtracts 1 from the address to handle tail-call
+ * return-address corrections; pass @addr + 1 to recover the original.
+ */
+static int sprint_with_lineinfo(char *buf, unsigned long addr)
+{
+	return sprint_backtrace(buf, addr + 1);
+}
+
+/*
+ * Return true if @buf contains a lineinfo annotation matching
+ * the pattern " (<path>:<line>)".
+ *
+ * The path may be a full path like "lib/tests/lineinfo_kunit.c" or
+ * a shortened form from module lineinfo (e.g., just a directory name).
+ */
+static bool has_lineinfo(const char *buf)
+{
+	const char *p, *colon, *end;
+
+	p = strstr(buf, " (");
+	if (!p)
+		return false;
+	p += 2; /* skip " (" */
+
+	colon = strchr(p, ':');
+	if (!colon || colon == p)
+		return false;
+
+	/* After colon: one or more digits then ')' */
+	end = colon + 1;
+	if (*end < '0' || *end > '9')
+		return false;
+	while (*end >= '0' && *end <= '9')
+		end++;
+	return *end == ')';
+}
+
+/*
+ * Extract line number from a lineinfo annotation.
+ * Returns 0 if not found.
+ */
+static unsigned int extract_line(const char *buf)
+{
+	const char *p, *colon;
+	unsigned int line = 0;
+
+	p = strstr(buf, " (");
+	if (!p)
+		return 0;
+
+	colon = strchr(p + 2, ':');
+	if (!colon)
+		return 0;
+
+	colon++;
+	while (*colon >= '0' && *colon <= '9') {
+		line = line * 10 + (*colon - '0');
+		colon++;
+	}
+	return line;
+}
+
+/*
+ * Check if the lineinfo annotation contains the given filename substring.
+ */
+static bool lineinfo_contains_file(const char *buf, const char *name)
+{
+	const char *p, *colon;
+
+	p = strstr(buf, " (");
+	if (!p)
+		return false;
+
+	colon = strchr(p + 2, ':');
+	if (!colon)
+		return false;
+
+	/* Search for @name between '(' and ':' */
+	return strnstr(p + 1, name, colon - p - 1) != NULL;
+}
+
+/* --------------- target functions --------------- */
+
+static noinline int lineinfo_target_normal(void)
+{
+	barrier();
+	return 42;
+}
+
+static noinline int lineinfo_target_short(void)
+{
+	barrier();
+	return 1;
+}
+
+static noinline int lineinfo_target_with_arg(int x)
+{
+	barrier();
+	return x + 1;
+}
+
+static noinline int lineinfo_target_many_lines(void)
+{
+	int a = 0;
+
+	barrier();
+	a += 1;
+	a += 2;
+	a += 3;
+	a += 4;
+	a += 5;
+	a += 6;
+	a += 7;
+	a += 8;
+	a += 9;
+	a += 10;
+	barrier();
+	return a;
+}
+
+static __always_inline int lineinfo_inline_helper(void)
+{
+	return 99;
+}
+
+static noinline int lineinfo_inline_caller(void)
+{
+	barrier();
+	return lineinfo_inline_helper();
+}
+
+/* 10-deep call chain */
+static noinline int lineinfo_chain_10(void) { barrier(); return 10; }
+static noinline int lineinfo_chain_9(void) { barrier(); return lineinfo_chain_10(); }
+static noinline int lineinfo_chain_8(void) { barrier(); return lineinfo_chain_9(); }
+static noinline int lineinfo_chain_7(void) { barrier(); return lineinfo_chain_8(); }
+static noinline int lineinfo_chain_6(void) { barrier(); return lineinfo_chain_7(); }
+static noinline int lineinfo_chain_5(void) { barrier(); return lineinfo_chain_6(); }
+static noinline int lineinfo_chain_4(void) { barrier(); return lineinfo_chain_5(); }
+static noinline int lineinfo_chain_3(void) { barrier(); return lineinfo_chain_4(); }
+static noinline int lineinfo_chain_2(void) { barrier(); return lineinfo_chain_3(); }
+static noinline int lineinfo_chain_1(void) { barrier(); return lineinfo_chain_2(); }
+
+/* --------------- Group A: Basic
lineinfo presence --------------- */ + +static void test_normal_function(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo in: %s", buf); + KUNIT_EXPECT_TRUE_MSG(test, + lineinfo_contains_file(buf, "lineinfo_kunit.c"), + "Wrong file in: %s", buf); +} + +static void test_static_function(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_short; + + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo in: %s", buf); +} + +static void test_noinline_function(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_with_arg; + + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo in: %s", buf); +} + +static void test_inline_function(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_inline_caller; + + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo for inline caller in: %s", buf); + KUNIT_EXPECT_TRUE_MSG(test, + lineinfo_contains_file(buf, "lineinfo_kunit.c"), + "Wrong file in: %s", buf); +} + +static void test_short_function(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_short; + + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo for short function in: %s", buf); +} + +static void test_many_lines_function(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_many_lines; + unsigned int line; + + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo in: %s", buf); + 
line =3D extract_line(buf); + KUNIT_EXPECT_GT_MSG(test, line, (unsigned int)0, + "Line number should be > 0 in: %s", buf); +} + +/* --------------- Group B: Deep call chain --------------- */ + +typedef int (*chain_fn_t)(void); + +static void test_deep_call_chain(struct kunit *test) +{ + static const chain_fn_t chain_fns[] =3D { + lineinfo_chain_1, lineinfo_chain_2, + lineinfo_chain_3, lineinfo_chain_4, + lineinfo_chain_5, lineinfo_chain_6, + lineinfo_chain_7, lineinfo_chain_8, + lineinfo_chain_9, lineinfo_chain_10, + }; + char *buf =3D alloc_sym_buf(test); + int i, found =3D 0; + + /* Call chain to prevent dead-code elimination */ + KUNIT_ASSERT_EQ(test, lineinfo_chain_1(), 10); + + for (i =3D 0; i < ARRAY_SIZE(chain_fns); i++) { + unsigned long addr =3D (unsigned long)chain_fns[i]; + + sprint_with_lineinfo(buf, addr); + if (has_lineinfo(buf)) + found++; + } + + /* + * Not every tiny function gets DWARF line info (compiler may + * omit it for very small stubs), but at least some should. + */ + KUNIT_EXPECT_GT_MSG(test, found, 0, + "None of the 10 chain functions had lineinfo"); +} + +/* --------------- Group C: sprint_symbol API variants --------------- */ + +static void test_sprint_symbol_format(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + sprint_symbol(buf, addr); + + /* Should contain +0x and /0x for offset/size */ + KUNIT_EXPECT_NOT_NULL_MSG(test, strstr(buf, "+0x"), + "Missing offset in: %s", buf); + KUNIT_EXPECT_NOT_NULL_MSG(test, strstr(buf, "/0x"), + "Missing size in: %s", buf); + /* + * sprint_symbol() backs %ps, which existing format strings combine + * with literal "()" to indicate function calls; the lineinfo suffix + * is intentionally omitted there to avoid "foo (file:line)()". 
+ */ + KUNIT_EXPECT_FALSE_MSG(test, has_lineinfo(buf), + "Unexpected lineinfo in sprint_symbol output: %s", + buf); +} + +static void test_sprint_backtrace(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + /* sprint_backtrace subtracts 1 internally to handle tail calls */ + sprint_backtrace(buf, addr + 1); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo in backtrace: %s", buf); + KUNIT_EXPECT_TRUE_MSG(test, + lineinfo_contains_file(buf, "lineinfo_kunit.c"), + "Wrong file in backtrace: %s", buf); +} + +static void test_sprint_backtrace_build_id(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + sprint_backtrace_build_id(buf, addr + 1); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo in backtrace_build_id: %s", buf); +} + +static void test_sprint_symbol_no_offset(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + sprint_symbol_no_offset(buf, addr); + /* No "+0x" in output */ + KUNIT_EXPECT_NULL_MSG(test, strstr(buf, "+0x"), + "Unexpected offset in no_offset: %s", buf); + /* sprint_symbol_no_offset is a sprint_symbol() variant; lineinfo is + * intentionally only appended in sprint_backtrace*() context. + */ + KUNIT_EXPECT_FALSE_MSG(test, has_lineinfo(buf), + "Unexpected lineinfo in no_offset: %s", buf); +} + +/* --------------- Group D: printk format specifiers --------------- */ + +static void test_pS_format(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + void *addr =3D lineinfo_target_normal; + + snprintf(buf, KSYM_SYMBOL_LEN, "%pS", addr); + /* + * %pS uses sprint_symbol(), which intentionally omits the lineinfo + * suffix (see kernel/kallsyms.c::__sprint_symbol). Lineinfo is only + * added via the sprint_backtrace*() entry points, which back %pBb. 
+ */ + KUNIT_EXPECT_FALSE_MSG(test, has_lineinfo(buf), + "Unexpected lineinfo in %%pS: %s", buf); +} + +static void test_pBb_format(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + /* + * %pBb uses sprint_backtrace_build_id which subtracts 1 from the + * address, so pass addr+1 to resolve back to the function. + */ + void *addr =3D (void *)((unsigned long)lineinfo_target_normal + 1); + + snprintf(buf, KSYM_SYMBOL_LEN, "%pBb", addr); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo in %%pBb: %s", buf); +} + +static void test_pSR_format(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + void *addr =3D lineinfo_target_normal; + + snprintf(buf, KSYM_SYMBOL_LEN, "%pSR", addr); + /* %pSR is a sprint_symbol() variant; same rationale as %pS. */ + KUNIT_EXPECT_FALSE_MSG(test, has_lineinfo(buf), + "Unexpected lineinfo in %%pSR: %s", buf); +} + +/* --------------- Group E: Address edge cases --------------- */ + +static void test_symbol_start_addr(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + /* + * sprint_backtrace() subtracts 1 from the input and reports offset + * relative to the (decremented) address, so an exact "+0x0/" can't + * be expected here. Verify the symbol resolves and carries lineinfo. + */ + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_TRUE_MSG(test, + strnstr(buf, "lineinfo_target_normal", + KSYM_SYMBOL_LEN) !=3D NULL, + "Didn't resolve to expected function: %s", buf); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo at function start: %s", buf); +} + +static void test_symbol_nonzero_offset(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + /* + * sprint_backtrace subtracts 1 internally. + * Passing addr+2 resolves to addr+1 which is inside the function + * at a non-zero offset. 
+ */ + sprint_backtrace(buf, addr + 2); + KUNIT_EXPECT_TRUE_MSG(test, + strnstr(buf, "lineinfo_target_normal", + KSYM_SYMBOL_LEN) !=3D NULL, + "Didn't resolve to expected function: %s", buf); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo at non-zero offset: %s", buf); +} + +static void test_unknown_address(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + + sprint_symbol(buf, 1UL); + /* Should be "0x1" with no lineinfo */ + KUNIT_EXPECT_NOT_NULL_MSG(test, strstr(buf, "0x1"), + "Expected hex address for bogus addr: %s", buf); + KUNIT_EXPECT_FALSE_MSG(test, has_lineinfo(buf), + "Unexpected lineinfo for bogus addr: %s", buf); +} + +static void test_kernel_function_lineinfo(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)sprint_symbol; + + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo for sprint_symbol: %s", buf); + KUNIT_EXPECT_TRUE_MSG(test, + lineinfo_contains_file(buf, "kallsyms.c"), + "Expected kallsyms.c in: %s", buf); +} + +static void test_assembly_no_lineinfo(struct kunit *test) +{ +#if IS_BUILTIN(CONFIG_LINEINFO_KUNIT_TEST) + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)_text; + + sprint_with_lineinfo(buf, addr); + /* + * _text is typically an asm entry point with no DWARF line info. + * If it has lineinfo, it's a C-based entry =E2=80=94 skip in that case. 
+ */ + if (has_lineinfo(buf)) + kunit_skip(test, "_text has lineinfo (C entry?): %s", buf); + + KUNIT_EXPECT_FALSE_MSG(test, has_lineinfo(buf), + "Unexpected lineinfo for asm symbol: %s", buf); +#else + kunit_skip(test, "_text not accessible from modules"); +#endif +} + +/* --------------- Group F: Module path --------------- */ + +static void test_module_function_lineinfo(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + if (!IS_MODULE(CONFIG_LINEINFO_KUNIT_TEST)) { + kunit_skip(test, "Test only meaningful when built as module"); + return; + } + + sprint_with_lineinfo(buf, addr); + KUNIT_EXPECT_NOT_NULL_MSG(test, + strstr(buf, "[lineinfo_kunit"), + "Missing module name in: %s", buf); + KUNIT_EXPECT_TRUE_MSG(test, has_lineinfo(buf), + "No lineinfo for module function: %s", buf); + KUNIT_EXPECT_TRUE_MSG(test, + lineinfo_contains_file(buf, "lineinfo_kunit.c"), + "Wrong file for module function: %s", buf); +} + +/* --------------- Group G: Stress --------------- */ + +struct lineinfo_stress_data { + unsigned long addr; + atomic_t failures; +}; + +static void lineinfo_stress_fn(void *info) +{ + struct lineinfo_stress_data *data =3D info; + char buf[KSYM_SYMBOL_LEN]; + int i; + + for (i =3D 0; i < 100; i++) { + sprint_with_lineinfo(buf, data->addr); + if (!has_lineinfo(buf)) + atomic_inc(&data->failures); + } +} + +static void test_concurrent_sprint_symbol(struct kunit *test) +{ + struct lineinfo_stress_data data; + + data.addr =3D (unsigned long)lineinfo_target_normal; + atomic_set(&data.failures, 0); + + on_each_cpu(lineinfo_stress_fn, &data, 1); + + KUNIT_EXPECT_EQ_MSG(test, atomic_read(&data.failures), 0, + "Concurrent lineinfo failures detected"); +} + +static void test_rapid_sprint_symbol(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + int i, failures =3D 0; + + for (i =3D 0; i < 1000; i++) { + 
sprint_with_lineinfo(buf, addr); + if (!has_lineinfo(buf)) + failures++; + } + + KUNIT_EXPECT_EQ_MSG(test, failures, 0, + "Rapid sprint_symbol failures: %d/1000", failures); +} + +/* --------------- Group H: Safety and plausibility --------------- */ + +static void test_line_number_plausible(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + unsigned int line; + + sprint_with_lineinfo(buf, addr); + KUNIT_ASSERT_TRUE(test, has_lineinfo(buf)); + + line =3D extract_line(buf); + KUNIT_EXPECT_GT_MSG(test, line, (unsigned int)0, + "Line number should be > 0"); + KUNIT_EXPECT_LT_MSG(test, line, (unsigned int)10000, + "Line number %u implausibly large for this file", + line); +} + +static void test_buffer_no_overflow(struct kunit *test) +{ + const size_t canary_size =3D 16; + char *buf; + int i; + + buf =3D kunit_kzalloc(test, KSYM_SYMBOL_LEN + canary_size, GFP_KERNEL); + KUNIT_ASSERT_NOT_NULL(test, buf); + + /* Fill canary area past KSYM_SYMBOL_LEN with 0xAA */ + memset(buf + KSYM_SYMBOL_LEN, 0xAA, canary_size); + + sprint_with_lineinfo(buf, (unsigned long)lineinfo_target_normal); + + /* Verify canary bytes are untouched */ + for (i =3D 0; i < canary_size; i++) { + KUNIT_EXPECT_EQ_MSG(test, + (unsigned char)buf[KSYM_SYMBOL_LEN + i], + (unsigned char)0xAA, + "Buffer overflow at offset %d past KSYM_SYMBOL_LEN", + i); + } +} + +static void test_dump_stack_no_crash(struct kunit *test) +{ + /* Just verify dump_stack() completes without panic */ + dump_stack(); + KUNIT_SUCCEED(test); +} + +static void test_sprint_symbol_build_id(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + + sprint_symbol_build_id(buf, addr); + /* Lineinfo is appended only via sprint_backtrace*(); the symbol + * variants intentionally omit it to avoid clashing with format + * strings that already wrap %ps in literal "()". 
+ */ + KUNIT_EXPECT_FALSE_MSG(test, has_lineinfo(buf), + "Unexpected lineinfo in sprint_symbol_build_id: %s", + buf); +} + +static void test_sleb128_edge_cases(struct kunit *test) +{ + u32 pos; + int32_t result; + + /* Value 0: single byte 0x00 */ + { + static const u8 data[] =3D { 0x00 }; + + pos =3D 0; + result =3D lineinfo_read_sleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (int32_t)0); + KUNIT_EXPECT_EQ(test, pos, (u32)1); + } + + /* Value -1: single byte 0x7F */ + { + static const u8 data[] =3D { 0x7f }; + + pos =3D 0; + result =3D lineinfo_read_sleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (int32_t)-1); + KUNIT_EXPECT_EQ(test, pos, (u32)1); + } + + /* Value 1: single byte 0x01 */ + { + static const u8 data[] =3D { 0x01 }; + + pos =3D 0; + result =3D lineinfo_read_sleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (int32_t)1); + KUNIT_EXPECT_EQ(test, pos, (u32)1); + } + + /* Value -64: single byte 0x40 */ + { + static const u8 data[] =3D { 0x40 }; + + pos =3D 0; + result =3D lineinfo_read_sleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (int32_t)-64); + KUNIT_EXPECT_EQ(test, pos, (u32)1); + } + + /* Value 63: single byte 0x3F */ + { + static const u8 data[] =3D { 0x3f }; + + pos =3D 0; + result =3D lineinfo_read_sleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (int32_t)63); + KUNIT_EXPECT_EQ(test, pos, (u32)1); + } + + /* Value -128: two bytes 0x80 0x7F */ + { + static const u8 data[] =3D { 0x80, 0x7f }; + + pos =3D 0; + result =3D lineinfo_read_sleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (int32_t)-128); + KUNIT_EXPECT_EQ(test, pos, (u32)2); + } +} + +static void test_uleb128_edge_cases(struct kunit *test) +{ + u32 pos, result; + + /* Value 0: single byte 0x00 */ + { + static const u8 data[] =3D { 0x00 }; + + pos =3D 0; + result =3D lineinfo_read_uleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (u32)0); + KUNIT_EXPECT_EQ(test, 
pos, (u32)1); + } + + /* Value 127: single byte 0x7F */ + { + static const u8 data[] =3D { 0x7F }; + + pos =3D 0; + result =3D lineinfo_read_uleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (u32)127); + KUNIT_EXPECT_EQ(test, pos, (u32)1); + } + + /* Value 128: two bytes 0x80 0x01 */ + { + static const u8 data[] =3D { 0x80, 0x01 }; + + pos =3D 0; + result =3D lineinfo_read_uleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (u32)128); + KUNIT_EXPECT_EQ(test, pos, (u32)2); + } + + /* Max u32 0xFFFFFFFF: 5 bytes */ + { + static const u8 data[] =3D { 0xFF, 0xFF, 0xFF, 0xFF, 0x0F }; + + pos =3D 0; + result =3D lineinfo_read_uleb128(data, &pos, sizeof(data)); + KUNIT_EXPECT_EQ(test, result, (u32)0xFFFFFFFF); + KUNIT_EXPECT_EQ(test, pos, (u32)5); + } + + /* Truncated input: pos >=3D end returns 0 */ + { + static const u8 data[] =3D { 0x80 }; + + pos =3D 0; + result =3D lineinfo_read_uleb128(data, &pos, 0); + KUNIT_EXPECT_EQ_MSG(test, result, (u32)0, + "Expected 0 for empty input"); + } + + /* Truncated mid-varint: continuation byte but end reached */ + { + static const u8 data[] =3D { 0x80 }; + + pos =3D 0; + result =3D lineinfo_read_uleb128(data, &pos, 1); + KUNIT_EXPECT_EQ_MSG(test, result, (u32)0, + "Expected 0 for truncated varint"); + KUNIT_EXPECT_EQ(test, pos, (u32)1); + } +} + +static void test_line_number_accuracy(struct kunit *test) +{ + char *buf =3D alloc_sym_buf(test); + unsigned long addr =3D (unsigned long)lineinfo_target_normal; + unsigned int line; + + sprint_with_lineinfo(buf, addr); + KUNIT_ASSERT_TRUE(test, has_lineinfo(buf)); + + line =3D extract_line(buf); + + /* + * lineinfo_target_normal is defined around line 103-107. + * Allow wide range: KASAN instrumentation and module lineinfo + * address mapping can shift the reported line significantly. 
+	 */
+	KUNIT_EXPECT_GE_MSG(test, line, (unsigned int)50,
+			    "Line %u too low for lineinfo_target_normal", line);
+	KUNIT_EXPECT_LE_MSG(test, line, (unsigned int)300,
+			    "Line %u too high for lineinfo_target_normal", line);
+}
+
+static void test_many_lines_mid_function(struct kunit *test)
+{
+	char *buf = alloc_sym_buf(test);
+	unsigned long addr = (unsigned long)lineinfo_target_many_lines;
+	unsigned int line;
+	unsigned long mid_addr;
+
+	/* Sanity check: the function entry address must carry lineinfo */
+	sprint_with_lineinfo(buf, addr);
+	KUNIT_ASSERT_TRUE(test, has_lineinfo(buf));
+
+	/* Try an address 8 bytes into the function (past prologue) */
+	mid_addr = addr + 8;
+	sprint_with_lineinfo(buf, mid_addr);
+
+	/*
+	 * Should still resolve to lineinfo_target_many_lines.
+	 * If lineinfo is present, the line number must be plausible.
+	 */
+	KUNIT_EXPECT_TRUE_MSG(test,
+			      strnstr(buf, "lineinfo_target_many_lines",
+				      KSYM_SYMBOL_LEN) != NULL,
+			      "Mid-function addr resolved to wrong symbol: %s",
+			      buf);
+	if (has_lineinfo(buf)) {
+		line = extract_line(buf);
+		KUNIT_EXPECT_GE_MSG(test, line, (unsigned int)50,
+				    "Line %u too low for mid-function", line);
+		KUNIT_EXPECT_LE_MSG(test, line, (unsigned int)700,
+				    "Line %u too high for mid-function", line);
+	}
+}
+
+/* --------------- Suite registration --------------- */
+
+static struct kunit_case lineinfo_test_cases[] = {
+	/* Group A: Basic lineinfo presence */
+	KUNIT_CASE(test_normal_function),
+	KUNIT_CASE(test_static_function),
+	KUNIT_CASE(test_noinline_function),
+	KUNIT_CASE(test_inline_function),
+	KUNIT_CASE(test_short_function),
+	KUNIT_CASE(test_many_lines_function),
+	/* Group B: Deep call chain */
+	KUNIT_CASE(test_deep_call_chain),
+	/* Group C: sprint_symbol API variants */
+	KUNIT_CASE(test_sprint_symbol_format),
+	KUNIT_CASE(test_sprint_backtrace),
+	KUNIT_CASE(test_sprint_backtrace_build_id),
+	KUNIT_CASE(test_sprint_symbol_no_offset),
+	/* Group D: printk format specifiers */
+	KUNIT_CASE(test_pS_format),
+	KUNIT_CASE(test_pBb_format),
+	KUNIT_CASE(test_pSR_format),
+	/* Group E: Address edge cases */
+	KUNIT_CASE(test_symbol_start_addr),
+	KUNIT_CASE(test_symbol_nonzero_offset),
+	KUNIT_CASE(test_unknown_address),
+	KUNIT_CASE(test_kernel_function_lineinfo),
+	KUNIT_CASE(test_assembly_no_lineinfo),
+	/* Group F: Module path */
+	KUNIT_CASE(test_module_function_lineinfo),
+	/* Group G: Stress */
+	KUNIT_CASE_SLOW(test_concurrent_sprint_symbol),
+	KUNIT_CASE_SLOW(test_rapid_sprint_symbol),
+	/* Group H: Safety and plausibility */
+	KUNIT_CASE(test_line_number_plausible),
+	KUNIT_CASE(test_buffer_no_overflow),
+	KUNIT_CASE(test_dump_stack_no_crash),
+	KUNIT_CASE(test_sprint_symbol_build_id),
+	/* Group I: Encoding/decoding and accuracy */
+	KUNIT_CASE(test_sleb128_edge_cases),
+	KUNIT_CASE(test_uleb128_edge_cases),
+	KUNIT_CASE(test_line_number_accuracy),
+	KUNIT_CASE(test_many_lines_mid_function),
+	{}
+};
+
+static struct kunit_suite lineinfo_test_suite = {
+	.name = "lineinfo",
+	.test_cases = lineinfo_test_cases,
+};
+kunit_test_suites(&lineinfo_test_suite);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("KUnit tests for kallsyms lineinfo");
+MODULE_AUTHOR("Sasha Levin");
-- 
2.53.0
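
Reader's note on the LEB128 edge cases exercised above: the byte
sequences in test_sleb128_edge_cases() and test_uleb128_edge_cases()
follow standard LEB128 (7 payload bits per byte, high bit set on every
byte except the last, sign taken from bit 6 of the final byte for the
signed form). The userspace sketch below shows one way a decoder could
satisfy those expectations. It is not the kernel's
lineinfo_read_uleb128()/lineinfo_read_sleb128(); the sketch_* names are
hypothetical, and the "return 0 on truncated input, with *pos advanced
past the bytes consumed" behaviour is an assumption inferred only from
what the tests assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in, NOT the kernel implementation.
 * Decode an unsigned LEB128 value from data[*pos .. end), advancing
 * *pos past every byte consumed. Returns 0 if the input runs out
 * before a terminating byte (high bit clear) is seen; this mirrors
 * what the KUnit cases above expect and is otherwise an assumption.
 */
static uint32_t sketch_read_uleb128(const uint8_t *data, uint32_t *pos,
				    uint32_t end)
{
	uint32_t result = 0;
	unsigned int shift = 0;

	while (*pos < end) {
		uint8_t byte = data[(*pos)++];

		/* Ignore payload bits that would land beyond bit 31. */
		if (shift < 32)
			result |= (uint32_t)(byte & 0x7f) << shift;
		shift += 7;
		if (!(byte & 0x80))
			return result;
	}
	return 0;	/* empty or truncated input */
}

/*
 * Signed variant: same loop, then sign-extend from bit 6 of the last
 * byte. The final cast assumes two's-complement conversion, which
 * holds on the compilers the kernel supports.
 */
static int32_t sketch_read_sleb128(const uint8_t *data, uint32_t *pos,
				   uint32_t end)
{
	uint32_t result = 0;
	unsigned int shift = 0;

	while (*pos < end) {
		uint8_t byte = data[(*pos)++];

		if (shift < 32)
			result |= (uint32_t)(byte & 0x7f) << shift;
		shift += 7;
		if (!(byte & 0x80)) {
			if ((byte & 0x40) && shift < 32)
				result |= ~(uint32_t)0 << shift;
			return (int32_t)result;
		}
	}
	return 0;	/* empty or truncated input */
}
```

Feeding the sketch the same vectors as the KUnit cases (for example
{ 0x80, 0x7f } for -128, or { 0xFF, 0xFF, 0xFF, 0xFF, 0x0F } for
0xFFFFFFFF) is a quick way to sanity-check a tool-side encoder against
the format the tests assume.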