kernel/module.c | 26 +++++++++++++++++++++++--- 1 file changed, 23 insertions(+), 3 deletions(-)
If kernel modules are stripped off symbols for some reason (e.g. built by
using strip --strip-unneeded option) then stack traces in dmesg do not show
symbol name for address. It just prints absolute address sometimes (if
there is no good match with any symbol)
e.g.
[245864.699580] do_nmi+0x12f/0x370
[245864.699583] end_repeat_nmi+0x16/0x50
[245864.699585] RIP: 0010:0xffffffffc06b67ec <<<<<<<<
[245864.699585] RSP: 0000:ffffaaa540cffe48 EFLAGS: 00000097
[245864.699586] RAX: 0000000000000001 RBX: ffff93357a729000 RCX: 0000000000000001
[245864.699587] RDX: ffff93357a729050 RSI: 0000000000000000 RDI: ffff93357a729000
[245864.699588] RBP: ffff9335cf521300 R08: 0000000000000001 R09: 0000000000000004
[245864.699588] R10: ffffaaa545b23ed0 R11: 0000000000000001 R12: ffffffffc06b61a0
[245864.699589] R13: ffffaaa540cffe60 R14: ffff9335c77fa3c0 R15: ffff9335cf51d7c0
[245864.699590] ? 0xffffffffc06b61a0
[245864.699592] ? 0xffffffffc06b67ec <<<<<<<<
[245864.699593] ? 0xffffffffc06b67ec
[245864.699594] </NMI>
Note RIP: 0010:0xffffffffc06b67ec and 0xffffffffc06b67ec printed in above
stack trace as absolute address.
There is no easy way in case box crashes as we loose information on load
address of specific module. This changes the symbol decoding (in kernel/
module.c) such that it can print offset from start of section (.text or
.init.text) in case there is no good match with any symbol.
It will now decode address in such cases to [module]+ offset/size or
[module __init]+offset/size depending on where the address lies (in
core/.text or init/.init.text section of module).
One can use objdump/readelf/nm to find symbols with offset from
.init.text and .text sections.
steps to reproduce the problem:
-------------------------------
1. Add WARN_ON_ONCE(1) in module e.g. test_module.c
2. Build and strip the module using --strip-unneeded option
3. Load the module and check RIP in dmesg
tests done:
-----------
1. Added WARN_ON_ONE(1) in functions of a module for testing
-------------------------------------------------------------
[ 407.934085] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
[ 407.934087] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 407.934088] RIP: 0010:[module __init]+0x4/0x7 [test_module]
[ 407.934097] Code: Unable to access opcode bytes at RIP 0xffffffffc07edfda.
[ 407.934098] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
[ 407.934100] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 407.934101] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
[ 407.934102] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
[ 407.934103] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
[ 407.934104] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
[ 407.934105] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
[ 407.934107] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 407.934108] CR2: ffffffffc07edfda CR3: 00000000063ea006 CR4: 00000000000706f0
[ 407.934113] Call Trace:
[ 407.934114] <TASK>
[ 407.934116] ? init_module+0x55/0xff9 [test_module]
...
[ 407.934232] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
[ 407.934234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 407.934242] RIP: 0010:[module]+0x4/0x7 [test_module]
[ 407.934248] Code: Unable to access opcode bytes at RIP 0xffffffffc07e1fda.
[ 407.934249] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
[ 407.934251] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 407.934252] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
[ 407.934253] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
[ 407.934254] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
[ 407.934255] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
[ 407.934256] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
[ 407.934257] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 407.934258] CR2: ffffffffc07e1fda CR3: 00000000063ea006 CR4: 00000000000706f0
[ 407.934260] Call Trace:
[ 407.934260] <TASK>
[ 407.934261] ? init_module+0x5a/0xff9 [test_module]
note that it is able to decode RIP to an offset from module start or
init start now.
tested on linux->next (tag next-20211220)
Signed-off-by: Vimal Agrawal <vimal.agrawal@sophos.com>
---
kernel/module.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c
index 24dab046e16c..4de15c06e760 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4270,14 +4270,21 @@ static const char *find_kallsyms_symbol(struct module *mod,
unsigned long *offset)
{
unsigned int i, best = 0;
- unsigned long nextval, bestval;
+ unsigned long baseval, nextval, bestval;
struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
+ char *module_base_name;
/* At worse, next value is at end of module */
- if (within_module_init(addr, mod))
+ if (within_module_init(addr, mod)) {
+ baseval = (unsigned long)mod->init_layout.base;
nextval = (unsigned long)mod->init_layout.base+mod->init_layout.text_size;
- else
+ module_base_name = "[module __init]";
+
+ } else {
+ baseval = (unsigned long)mod->core_layout.base;
nextval = (unsigned long)mod->core_layout.base+mod->core_layout.text_size;
+ module_base_name = "[module]";
+ }
bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);
@@ -4308,6 +4315,19 @@ static const char *find_kallsyms_symbol(struct module *mod,
nextval = thisval;
}
+ if ((is_module_text_address(addr) &&
+ (bestval < baseval || bestval > nextval))) {
+ /*
+ * return MODULE base and offset if we could not find
+ * any best match for text address
+ */
+ if (size)
+ *size = nextval - baseval;
+ if (offset)
+ *offset = addr - baseval;
+ return module_base_name;
+ }
+
if (!best)
return NULL;
--
2.32.0
The subject should only be about 70-75 characters. You're going
way overboard here..
Try:
modules: add heuristic when stripping unneeded symbols
Then please describe in your commit log that on OpenWrt the
INSTALL_MOD_STRIP="--strip-unneeded" at kernel/module install
time, and so modules are stripped of unneeded symbols.
Your commit log currently does not give us the idea that this
is in fact a useful feature to consider, mentioning that this
is in fact used in OpenWrt tries to help us get a better idea
of a possible user of it and a potential benefit.
On Mon, Jan 17, 2022 at 12:24:52PM +0530, Vimal Agrawal wrote:
> If kernel modules are stripped off symbols for some reason (e.g. built by
> using strip --strip-unneeded option) then stack traces in dmesg do not show
> symbol name for address. It just prints absolute address sometimes (if
> there is no good match with any symbol)
> e.g.
>
> [245864.699580] do_nmi+0x12f/0x370
> [245864.699583] end_repeat_nmi+0x16/0x50
> [245864.699585] RIP: 0010:0xffffffffc06b67ec <<<<<<<<
> [245864.699585] RSP: 0000:ffffaaa540cffe48 EFLAGS: 00000097
> [245864.699586] RAX: 0000000000000001 RBX: ffff93357a729000 RCX: 0000000000000001
> [245864.699587] RDX: ffff93357a729050 RSI: 0000000000000000 RDI: ffff93357a729000
> [245864.699588] RBP: ffff9335cf521300 R08: 0000000000000001 R09: 0000000000000004
> [245864.699588] R10: ffffaaa545b23ed0 R11: 0000000000000001 R12: ffffffffc06b61a0
> [245864.699589] R13: ffffaaa540cffe60 R14: ffff9335c77fa3c0 R15: ffff9335cf51d7c0
> [245864.699590] ? 0xffffffffc06b61a0
> [245864.699592] ? 0xffffffffc06b67ec <<<<<<<<
> [245864.699593] ? 0xffffffffc06b67ec
> [245864.699594] </NMI>
>
> Note RIP: 0010:0xffffffffc06b67ec and 0xffffffffc06b67ec printed in above
> stack trace as absolute address.
> There is no easy way in case box crashes as we loose information on load
> address of specific module. This changes the symbol decoding (in kernel/
> module.c) such that it can print offset from start of section (.text or
> .init.text) in case there is no good match with any symbol.
>
> It will now decode address in such cases to [module]+ offset/size or
> [module __init]+offset/size depending on where the address lies (in
> core/.text or init/.init.text section of module).
>
> One can use objdump/readelf/nm to find symbols with offset from
> .init.text and .text sections.
>
> steps to reproduce the problem:
> -------------------------------
> 1. Add WARN_ON_ONCE(1) in module e.g. test_module.c
> 2. Build and strip the module using --strip-unneeded option
> 3. Load the module and check RIP in dmesg
>
> tests done:
> -----------
> 1. Added WARN_ON_ONE(1) in functions of a module for testing
> -------------------------------------------------------------
> [ 407.934085] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
> [ 407.934087] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [ 407.934088] RIP: 0010:[module __init]+0x4/0x7 [test_module]
> [ 407.934097] Code: Unable to access opcode bytes at RIP 0xffffffffc07edfda.
> [ 407.934098] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
> [ 407.934100] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 407.934101] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
> [ 407.934102] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
> [ 407.934103] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
> [ 407.934104] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
> [ 407.934105] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
> [ 407.934107] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 407.934108] CR2: ffffffffc07edfda CR3: 00000000063ea006 CR4: 00000000000706f0
> [ 407.934113] Call Trace:
> [ 407.934114] <TASK>
> [ 407.934116] ? init_module+0x55/0xff9 [test_module]
> ...
> [ 407.934232] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
> [ 407.934234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [ 407.934242] RIP: 0010:[module]+0x4/0x7 [test_module]
> [ 407.934248] Code: Unable to access opcode bytes at RIP 0xffffffffc07e1fda.
> [ 407.934249] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
> [ 407.934251] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 407.934252] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
> [ 407.934253] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
> [ 407.934254] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
> [ 407.934255] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
> [ 407.934256] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
> [ 407.934257] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 407.934258] CR2: ffffffffc07e1fda CR3: 00000000063ea006 CR4: 00000000000706f0
> [ 407.934260] Call Trace:
> [ 407.934260] <TASK>
> [ 407.934261] ? init_module+0x5a/0xff9 [test_module]
>
> note that it is able to decode RIP to an offset from module start or
> init start now.
>
> tested on linux->next (tag next-20211220)
Sorry but for some reason this is not working on my end, but then
again I'm not able to get the stripped out results you see either.
So it could be I just have too many debugging options enabled
that makes INSTALL_MOD_STRIP="--strip-unneeded" not really do
much. I don't know if that is possible but I will have to try
to reduce my build options to test.
> Signed-off-by: Vimal Agrawal <vimal.agrawal@sophos.com>
> ---
> kernel/module.c | 26 +++++++++++++++++++++++---
> 1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 24dab046e16c..4de15c06e760 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -4270,14 +4270,21 @@ static const char *find_kallsyms_symbol(struct module *mod,
> unsigned long *offset)
> {
> unsigned int i, best = 0;
> - unsigned long nextval, bestval;
> + unsigned long baseval, nextval, bestval;
> struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
> + char *module_base_name;
>
> /* At worse, next value is at end of module */
> - if (within_module_init(addr, mod))
> + if (within_module_init(addr, mod)) {
> + baseval = (unsigned long)mod->init_layout.base;
> nextval = (unsigned long)mod->init_layout.base+mod->init_layout.text_size;
> - else
> + module_base_name = "[module __init]";
> +
> + } else {
> + baseval = (unsigned long)mod->core_layout.base;
> nextval = (unsigned long)mod->core_layout.base+mod->core_layout.text_size;
> + module_base_name = "[module]";
> + }
>
> bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);
>
> @@ -4308,6 +4315,19 @@ static const char *find_kallsyms_symbol(struct module *mod,
> nextval = thisval;
> }
>
> + if ((is_module_text_address(addr) &&
> + (bestval < baseval || bestval > nextval))) {
> + /*
> + * return MODULE base and offset if we could not find
> + * any best match for text address
> + */
> + if (size)
> + *size = nextval - baseval;
> + if (offset)
> + *offset = addr - baseval;
> + return module_base_name;
> + }
> +
I don't like this code to go before the check for !best. You did not
have this in your v1 or v2 patch. Why did you change it?
Luis
> if (!best)
> return NULL;
>
> --
> 2.32.0
>
Hi Luis,
> Try:
>
> modules: add heuristic when stripping unneeded symbols
>
> Then please describe in your commit log that on OpenWrt the
> INSTALL_MOD_STRIP="--strip-unneeded" at kernel/module install
> time, and so modules are stripped of unneeded symbols.
Yes, Thanks, I will send the updated patch shortly.
> Sorry but for some reason this is not working on my end, but then
> again I'm not able to get the stripped out results you see either.
> So it could be I just have too many debugging options enabled
> that makes INSTALL_MOD_STRIP="--strip-unneeded" not really do
> much. I don't know if that is possible but I will have to try
> to reduce my build options to test.
Yes, there could be some dependencies due to other debug options and
it may not be able to strip it to a good extent.
Are you trying with changes in test_module.c?
I was able to reproduce it easily on ubuntu 21.10 with following patch:
diff --git a/lib/test_module.c b/lib/test_module.c
index debd19e35198..53578e7a34d7 100644
--- a/lib/test_module.c
+++ b/lib/test_module.c
@@ -14,10 +14,25 @@
#include <linux/module.h>
#include <linux/printk.h>
+static void test_module_warn_start(int x)
+{
+ if (x) WARN_ON_ONCE(1);
+}
+
+static void __init test_module_warn_init(int x)
+{
+ if (x) WARN_ON_ONCE(1);
+}
+
+
static int __init test_module_init(void)
{
pr_warn("Hello, world\n");
+ printk("address of test_module_warn_int is %px\n",
test_module_warn_init);
+ printk("address of test_module_warn_start is %px\n",
test_module_warn_start);
+ test_module_warn_init(1);
+ test_module_warn_start(1);
return 0;
}
After building this module, I used the following command to strip it:
> strip --strip-unneeded test_module.ko
and then loaded the stripped .ko using insmod.
Vimal
If kernel modules are stripped off symbols for some reason then stack
traces in dmesg do not show symbol name for address. It just prints
absolute address sometimes (if there is no good match with any symbol)
This was seen with OpenWrt which uses option INSTALL_MOD_STRIP=
"--strip-unneeded" at kernel/module build/install time, and so modules
are stripped off unneeded symbols.
[245864.699580] do_nmi+0x12f/0x370
[245864.699583] end_repeat_nmi+0x16/0x50
[245864.699585] RIP: 0010:0xffffffffc06b67ec <<<<<<<<
[245864.699585] RSP: 0000:ffffaaa540cffe48 EFLAGS: 00000097
[245864.699586] RAX: 0000000000000001 RBX: ffff93357a729000 RCX: 0000000000000001
[245864.699587] RDX: ffff93357a729050 RSI: 0000000000000000 RDI: ffff93357a729000
[245864.699588] RBP: ffff9335cf521300 R08: 0000000000000001 R09: 0000000000000004
[245864.699588] R10: ffffaaa545b23ed0 R11: 0000000000000001 R12: ffffffffc06b61a0
[245864.699589] R13: ffffaaa540cffe60 R14: ffff9335c77fa3c0 R15: ffff9335cf51d7c0
[245864.699590] ? 0xffffffffc06b61a0
[245864.699592] ? 0xffffffffc06b67ec <<<<<<<<
[245864.699593] ? 0xffffffffc06b67ec
[245864.699594] </NMI>
Note RIP: 0010:0xffffffffc06b67ec and 0xffffffffc06b67ec printed in above
stack trace as absolute address. There is no easy way in case box crashes
as we loose information on load address of specific module.
This changes the symbol decoding (in kernel/module.c) such that it can
print offset from start of section (.text or .init.text) in case there
is no good match with any symbol.
It will now decode address in such cases to [module]+ offset/size or
[module __init]+offset/size depending on where the address lies (in
core/.text or init/.init.text section of module).
One can use objdump/readelf/nm to find symbols with offset from .init.text
and .text sections.
steps to reproduce the problem:
-------------------------------
1. Add WARN_ON_ONCE(1) in module e.g. test_module.c
2. Build and strip the module using --strip-unneeded option
3. Load the module and check RIP in dmesg
tests done:
-----------
1. Added WARN_ON_ONE(1) in functions of a module for testing
-------------------------------------------------------------
[ 407.934085] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
[ 407.934087] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 407.934088] RIP: 0010:[module __init]+0x4/0x7 [test_module]
[ 407.934097] Code: Unable to access opcode bytes at RIP 0xffffffffc07edfda.
[ 407.934098] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
[ 407.934100] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 407.934101] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
[ 407.934102] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
[ 407.934103] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
[ 407.934104] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
[ 407.934105] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
[ 407.934107] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 407.934108] CR2: ffffffffc07edfda CR3: 00000000063ea006 CR4: 00000000000706f0
[ 407.934113] Call Trace:
[ 407.934114] <TASK>
[ 407.934116] ? init_module+0x55/0xff9 [test_module]
...
[ 407.934232] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
[ 407.934234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 407.934242] RIP: 0010:[module]+0x4/0x7 [test_module]
[ 407.934248] Code: Unable to access opcode bytes at RIP 0xffffffffc07e1fda.
[ 407.934249] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
[ 407.934251] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 407.934252] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
[ 407.934253] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
[ 407.934254] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
[ 407.934255] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
[ 407.934256] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
[ 407.934257] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 407.934258] CR2: ffffffffc07e1fda CR3: 00000000063ea006 CR4: 00000000000706f0
[ 407.934260] Call Trace:
[ 407.934260] <TASK>
[ 407.934261] ? init_module+0x5a/0xff9 [test_module]
note that it is able to decode RIP to an offset from module start or
init start now.
tested on linux->next (tag next-20211220)
Signed-off-by: Vimal Agrawal <vimal.agrawal@sophos.com>
---
kernel/module.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c
index 24dab046e16c..4de15c06e760 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4270,14 +4270,21 @@ static const char *find_kallsyms_symbol(struct module *mod,
unsigned long *offset)
{
unsigned int i, best = 0;
- unsigned long nextval, bestval;
+ unsigned long baseval, nextval, bestval;
struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
+ char *module_base_name;
/* At worse, next value is at end of module */
- if (within_module_init(addr, mod))
+ if (within_module_init(addr, mod)) {
+ baseval = (unsigned long)mod->init_layout.base;
nextval = (unsigned long)mod->init_layout.base+mod->init_layout.text_size;
- else
+ module_base_name = "[module __init]";
+
+ } else {
+ baseval = (unsigned long)mod->core_layout.base;
nextval = (unsigned long)mod->core_layout.base+mod->core_layout.text_size;
+ module_base_name = "[module]";
+ }
bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);
@@ -4308,6 +4315,19 @@ static const char *find_kallsyms_symbol(struct module *mod,
nextval = thisval;
}
+ if ((is_module_text_address(addr) &&
+ (bestval < baseval || bestval > nextval))) {
+ /*
+ * return MODULE base and offset if we could not find
+ * any best match for text address
+ */
+ if (size)
+ *size = nextval - baseval;
+ if (offset)
+ *offset = addr - baseval;
+ return module_base_name;
+ }
+
if (!best)
return NULL;
--
2.32.0
If kernel modules are stripped off symbols for some reason then stack
traces in dmesg do not show symbol name for address. It just prints
absolute address sometimes (if there is no good match with any symbol)
This was seen with OpenWrt which uses option INSTALL_MOD_STRIP=
"--strip-unneeded" at kernel/module build/install time, and so modules
are stripped off unneeded symbols.
[245864.699580] do_nmi+0x12f/0x370
[245864.699583] end_repeat_nmi+0x16/0x50
[245864.699585] RIP: 0010:0xffffffffc06b67ec <<<<<<<<
[245864.699585] RSP: 0000:ffffaaa540cffe48 EFLAGS: 00000097
[245864.699586] RAX: 0000000000000001 RBX: ffff93357a729000 RCX: 0000000000000001
[245864.699587] RDX: ffff93357a729050 RSI: 0000000000000000 RDI: ffff93357a729000
[245864.699588] RBP: ffff9335cf521300 R08: 0000000000000001 R09: 0000000000000004
[245864.699588] R10: ffffaaa545b23ed0 R11: 0000000000000001 R12: ffffffffc06b61a0
[245864.699589] R13: ffffaaa540cffe60 R14: ffff9335c77fa3c0 R15: ffff9335cf51d7c0
[245864.699590] ? 0xffffffffc06b61a0
[245864.699592] ? 0xffffffffc06b67ec <<<<<<<<
[245864.699593] ? 0xffffffffc06b67ec
[245864.699594] </NMI>
Note RIP: 0010:0xffffffffc06b67ec and 0xffffffffc06b67ec printed in above
stack trace as absolute address. There is no easy way in case box crashes
as we loose information on load address of specific module.
This changes the symbol decoding (in kernel/module.c) such that it can
print offset from start of section (.text or .init.text) in case there
is no good match with any symbol.
It will now decode address in such cases to [module]+ offset/size or
[module __init]+offset/size depending on where the address lies (in
core/.text or init/.init.text section of module).
One can use objdump/readelf/nm to find symbols with offset from .init.text
and .text sections.
steps to reproduce the problem:
-------------------------------
1. Add WARN_ON_ONCE(1) in module e.g. test_module.c
2. Build and strip the module using --strip-unneeded option
3. Load the module and check RIP in dmesg
tests done:
-----------
1. Added WARN_ON_ONE(1) in functions of a module for testing
-------------------------------------------------------------
[ 407.934085] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
[ 407.934087] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 407.934088] RIP: 0010:[module __init]+0x4/0x7 [test_module]
[ 407.934097] Code: Unable to access opcode bytes at RIP 0xffffffffc07edfda.
[ 407.934098] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
[ 407.934100] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 407.934101] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
[ 407.934102] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
[ 407.934103] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
[ 407.934104] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
[ 407.934105] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
[ 407.934107] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 407.934108] CR2: ffffffffc07edfda CR3: 00000000063ea006 CR4: 00000000000706f0
[ 407.934113] Call Trace:
[ 407.934114] <TASK>
[ 407.934116] ? init_module+0x55/0xff9 [test_module]
...
[ 407.934232] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
[ 407.934234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 407.934242] RIP: 0010:[module]+0x4/0x7 [test_module]
[ 407.934248] Code: Unable to access opcode bytes at RIP 0xffffffffc07e1fda.
[ 407.934249] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
[ 407.934251] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 407.934252] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
[ 407.934253] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
[ 407.934254] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
[ 407.934255] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
[ 407.934256] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
[ 407.934257] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 407.934258] CR2: ffffffffc07e1fda CR3: 00000000063ea006 CR4: 00000000000706f0
[ 407.934260] Call Trace:
[ 407.934260] <TASK>
[ 407.934261] ? init_module+0x5a/0xff9 [test_module]
note that it is able to decode RIP to an offset from module start or
init start now.
tested on linux->next (tag next-20211220)
Signed-off-by: Vimal Agrawal <vimal.agrawal@sophos.com>
Acked-by: Nishit Shah <nishit.shah@sophos.com>
Suggested-by: Dirk VanDerMerwe <Dirk.VanDerMerwe@Sophos.com>
---
kernel/module.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c
index 24dab046e16c..4de15c06e760 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4270,14 +4270,21 @@ static const char *find_kallsyms_symbol(struct module *mod,
unsigned long *offset)
{
unsigned int i, best = 0;
- unsigned long nextval, bestval;
+ unsigned long baseval, nextval, bestval;
struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
+ char *module_base_name;
/* At worse, next value is at end of module */
- if (within_module_init(addr, mod))
+ if (within_module_init(addr, mod)) {
+ baseval = (unsigned long)mod->init_layout.base;
nextval = (unsigned long)mod->init_layout.base+mod->init_layout.text_size;
- else
+ module_base_name = "[module __init]";
+
+ } else {
+ baseval = (unsigned long)mod->core_layout.base;
nextval = (unsigned long)mod->core_layout.base+mod->core_layout.text_size;
+ module_base_name = "[module]";
+ }
bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);
@@ -4308,6 +4315,19 @@ static const char *find_kallsyms_symbol(struct module *mod,
nextval = thisval;
}
+ if ((is_module_text_address(addr) &&
+ (bestval < baseval || bestval > nextval))) {
+ /*
+ * return MODULE base and offset if we could not find
+ * any best match for text address
+ */
+ if (size)
+ *size = nextval - baseval;
+ if (offset)
+ *offset = addr - baseval;
+ return module_base_name;
+ }
+
if (!best)
return NULL;
--
2.32.0
On Fri, Feb 04, 2022 at 02:09:32PM +0530, Vimal Agrawal wrote:
> If kernel modules are stripped off symbols for some reason then stack
> traces in dmesg do not show symbol name for address. It just prints
> absolute address sometimes (if there is no good match with any symbol)
>
> This was seen with OpenWrt which uses option INSTALL_MOD_STRIP=
> "--strip-unneeded" at kernel/module build/install time, and so modules
> are stripped off unneeded symbols.
>
> [245864.699580] do_nmi+0x12f/0x370
> [245864.699583] end_repeat_nmi+0x16/0x50
> [245864.699585] RIP: 0010:0xffffffffc06b67ec <<<<<<<<
> [245864.699585] RSP: 0000:ffffaaa540cffe48 EFLAGS: 00000097
> [245864.699586] RAX: 0000000000000001 RBX: ffff93357a729000 RCX: 0000000000000001
> [245864.699587] RDX: ffff93357a729050 RSI: 0000000000000000 RDI: ffff93357a729000
> [245864.699588] RBP: ffff9335cf521300 R08: 0000000000000001 R09: 0000000000000004
> [245864.699588] R10: ffffaaa545b23ed0 R11: 0000000000000001 R12: ffffffffc06b61a0
> [245864.699589] R13: ffffaaa540cffe60 R14: ffff9335c77fa3c0 R15: ffff9335cf51d7c0
> [245864.699590] ? 0xffffffffc06b61a0
> [245864.699592] ? 0xffffffffc06b67ec <<<<<<<<
> [245864.699593] ? 0xffffffffc06b67ec
> [245864.699594] </NMI>
>
> Note RIP: 0010:0xffffffffc06b67ec and 0xffffffffc06b67ec printed in above
> stack trace as absolute address. There is no easy way in case box crashes
> as we loose information on load address of specific module.
>
> This changes the symbol decoding (in kernel/module.c) such that it can
> print offset from start of section (.text or .init.text) in case there
> is no good match with any symbol.
>
> It will now decode address in such cases to [module]+ offset/size or
> [module __init]+offset/size depending on where the address lies (in
> core/.text or init/.init.text section of module).
>
> One can use objdump/readelf/nm to find symbols with offset from .init.text
> and .text sections.
>
> steps to reproduce the problem:
> -------------------------------
> 1. Add WARN_ON_ONCE(1) in module e.g. test_module.c
> 2. Build and strip the module using --strip-unneeded option
> 3. Load the module and check RIP in dmesg
>
> tests done:
> -----------
> 1. Added WARN_ON_ONE(1) in functions of a module for testing
> -------------------------------------------------------------
> [ 407.934085] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
> [ 407.934087] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [ 407.934088] RIP: 0010:[module __init]+0x4/0x7 [test_module]
> [ 407.934097] Code: Unable to access opcode bytes at RIP 0xffffffffc07edfda.
> [ 407.934098] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
> [ 407.934100] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 407.934101] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
> [ 407.934102] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
> [ 407.934103] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
> [ 407.934104] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
> [ 407.934105] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
> [ 407.934107] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 407.934108] CR2: ffffffffc07edfda CR3: 00000000063ea006 CR4: 00000000000706f0
> [ 407.934113] Call Trace:
> [ 407.934114] <TASK>
> [ 407.934116] ? init_module+0x55/0xff9 [test_module]
> ...
> [ 407.934232] CPU: 0 PID: 2956 Comm: insmod Tainted: G W E 5.16.0-rc5-next-20211220+ #2
> [ 407.934234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [ 407.934242] RIP: 0010:[module]+0x4/0x7 [test_module]
> [ 407.934248] Code: Unable to access opcode bytes at RIP 0xffffffffc07e1fda.
> [ 407.934249] RSP: 0018:ffffb21440487c20 EFLAGS: 00010202
> [ 407.934251] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 407.934252] RDX: 0000000000000000 RSI: ffffffff9c38e5e1 RDI: 0000000000000001
> [ 407.934253] RBP: ffffb21440487c28 R08: 0000000000000000 R09: ffffb21440487a20
> [ 407.934254] R10: ffffb21440487a18 R11: ffffffff9c755248 R12: ffffffffc07ee007
> [ 407.934255] R13: ffff92a0f1e260b0 R14: 0000000000000000 R15: 0000000000000000
> [ 407.934256] FS: 00007f578ebc4400(0000) GS:ffff92a1c0e00000(0000) knlGS:0000000000000000
> [ 407.934257] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 407.934258] CR2: ffffffffc07e1fda CR3: 00000000063ea006 CR4: 00000000000706f0
> [ 407.934260] Call Trace:
> [ 407.934260] <TASK>
> [ 407.934261] ? init_module+0x5a/0xff9 [test_module]
>
> note that it is able to decode RIP to an offset from module start or
> init start now.
>
> tested on linux->next (tag next-20211220)
>
> Signed-off-by: Vimal Agrawal <vimal.agrawal@sophos.com>
> Acked-by: Nishit Shah <nishit.shah@sophos.com>
> Suggested-by: Dirk VanDerMerwe <Dirk.VanDerMerwe@Sophos.com>
> ---
> kernel/module.c | 26 +++++++++++++++++++++++---
> 1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 24dab046e16c..4de15c06e760 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -4270,14 +4270,21 @@ static const char *find_kallsyms_symbol(struct module *mod,
> unsigned long *offset)
> {
> unsigned int i, best = 0;
> - unsigned long nextval, bestval;
> + unsigned long baseval, nextval, bestval;
> struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
> + char *module_base_name;
>
> /* At worse, next value is at end of module */
> - if (within_module_init(addr, mod))
> + if (within_module_init(addr, mod)) {
> + baseval = (unsigned long)mod->init_layout.base;
> nextval = (unsigned long)mod->init_layout.base+mod->init_layout.text_size;
> - else
> + module_base_name = "[module __init]";
> +
> + } else {
> + baseval = (unsigned long)mod->core_layout.base;
> nextval = (unsigned long)mod->core_layout.base+mod->core_layout.text_size;
> + module_base_name = "[module]";
> + }
>
> bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);
>
> @@ -4308,6 +4315,19 @@ static const char *find_kallsyms_symbol(struct module *mod,
> nextval = thisval;
> }
>
> + if ((is_module_text_address(addr) &&
> + (bestval < baseval || bestval > nextval))) {
> + /*
> + * return MODULE base and offset if we could not find
> + * any best match for text address
> + */
> + if (size)
> + *size = nextval - baseval;
> + if (offset)
> + *offset = addr - baseval;
> + return module_base_name;
> + }
> +
You did not explain why you change your code to not use the below
(!best) branch. I'd much prefer it there as that is when we know
for sure we have no real symbol defined.
Luis
>
> You did not explain why you change your code to not use the below
> (!best) branch. I'd much prefer it there as that is when we know
> for sure we have no real symbol defined.
>
> Luis
Actually I had it under (!best) in my first patch. I had to change it
because it was resolving the address to symbols like __this_module for
init address at times which is not correct for text address. It was
not entering inside if (!best) as it found some match but the match
was not correct. It could be fine for some non text addresses but not
for text addresses.
So I had to move this check out of (!best) and put a check explicitly
for text address to avoid any impact on non text addresses by
following:
+ if ((is_module_text_address(addr) &&
+ (bestval < baseval || bestval > nextval))) {
+ /*
+ * return MODULE base and offset if we could not find
+ * any best match for text address
+ */
I tested it on next-20220116-1-gff24014a52c0 today and I am able to
repro at least for init address easily with test_module.ko.
it is showing like following without the patch in my latest run (
decoding it to __this_module wrongly as mentioned earlier) :
[129558.843823] CPU: 2 PID: 39541 Comm: insmod Tainted: G W E
5.16.0-next-20220116+ #6
[129558.843827] Hardware name: innotek GmbH VirtualBox/VirtualBox,
BIOS VirtualBox 12/01/2006
[129558.843829] RIP: 0010:__this_module+0x9fc4/0x9fc7 [test_module].
<<<<<<<<<<<======================
[129558.843840] Code: Unable to access opcode bytes at RIP 0xffffffffc083ffda.
[129558.843841] RSP: 0018:ffffa2cc800cbbf0 EFLAGS: 00010202
[129558.843844] RAX: 0000000000000035 RBX: 0000000000000000 RCX:
0000000000000000
[129558.843846] RDX: 0000000000000000 RSI: ffffffffb6d930f1 RDI:
0000000000000001
[129558.843848] RBP: ffffa2cc800cbbf8 R08: 0000000000000000 R09:
ffffa2cc800cb9f0
[129558.843849] R10: ffffa2cc800cb9e8 R11: ffffffffb7155108 R12:
ffffffffc0840007
[129558.843851] R13: ffff97acb458d580 R14: 0000000000000000 R15:
ffffffffc0836040
[129558.843853] FS: 00007f752eb90400(0000) GS:ffff97ad80f00000(0000)
knlGS:0000000000000000
[129558.843855] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[129558.843857] CR2: ffffffffc083ffda CR3: 0000000100786001 CR4:
00000000000706e0
[129558.843863] Call Trace:
[129558.843865] <TASK>
[129558.843866] ? init_module+0x40/0xff9 [test_module]
Not sure what has changed but it is not showing the absolute address
as I had seen earlier.
post my patch, it is showing like following:
[ 59.600299] CPU: 1 PID: 1620 Comm: insmod Tainted: G E
5.16.0-next-20220116+ #7
[ 59.600303] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[ 59.600305] RIP: 0010:[module __init]+0x4/0x7 [test_module]
<<<<<<====================
[ 59.600315] Code: Unable to access opcode bytes at RIP 0xffffffffc0827fda.
[ 59.600316] RSP: 0018:ffffbda002a8fc20 EFLAGS: 00010202
[ 59.600319] RAX: 0000000000000035 RBX: 0000000000000000 RCX: 0000000000000000
[ 59.600321] RDX: 0000000000000000 RSI: ffffffffa0f930f1 RDI: 0000000000000001
[ 59.600323] RBP: ffffbda002a8fc28 R08: 0000000000000000 R09: ffffbda002a8fa20
[ 59.600325] R10: ffffbda002a8fa18 R11: ffffffffa1355108 R12: ffffffffc0828007
[ 59.600326] R13: ffff9554c4e72220 R14: 0000000000000000 R15: ffffffffc0822040
[ 59.600328] FS: 00007f9c773d8400(0000) GS:ffff9555c0e80000(0000)
knlGS:0000000000000000
[ 59.600331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 59.600333] CR2: ffffffffc0827fda CR3: 000000000c19c003 CR4: 00000000000706e0
[ 59.600338] Call Trace:
[ 59.600340] <TASK>
[ 59.600342] ? init_module+0x40/0xff9 [test_module]
I can update the patch explaining this in comments in between the code.
Vimal
© 2016 - 2026 Red Hat, Inc.