The profile_pc() try to get pc by doing a trick to read
the contents of the stack. This may cause false positives
for KASAN, like the following:
BUG: KASAN: stack-out-of-bounds in profile_pc+0x5b/0x90
Read of size 8 at addr ffff8881062a7a00 by task id/130040
CPU: 1 PID: 130040 Comm: id Kdump: loaded Not tainted 5.15.93+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0x4c/0x64
? profile_pc+0x5b/0x90
print_address_description.constprop.8.cold.12+0x10/0x36b
? profile_pc+0x5b/0x90
? profile_pc+0x5b/0x90
? tick_sched_handle.isra.20+0xa0/0xa0
kasan_report.cold.13+0x7f/0x11b
? scheduler_tick+0x30/0x150
? profile_pc+0x5b/0x90
? _raw_spin_lock+0x82/0xd0
profile_pc+0x5b/0x90
profile_tick+0x78/0xb0
? tick_sched_handle.isra.20+0x83/0xa0
tick_sched_timer+0x94/0xb0
? enqueue_hrtimer+0x100/0x100
? _raw_write_lock_irqsave+0xd0/0xd0
? recalibrate_cpu_khz+0x10/0x10
? ktime_get_update_offsets_now+0x148/0x1a0
hrtimer_interrupt+0x1b9/0x390
? sched_ttwu_pending+0xf1/0x150
__sysvec_apic_timer_interrupt+0x7c/0x150
sysvec_apic_timer_interrupt+0x61/0x80
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:_raw_spin_lock+0x82/0xd0
The KASAN checking is already disabled in the ORC unwinder,
so let's make profile_pc() use arch_stack_walk() to get pc,
which fixes the above BUG and also avoids open-coding of
unwind logic.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
arch/x86/kernel/time.c | 36 +++++++++++++++++-------------------
1 file changed, 17 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index e42faa792c07..eee884306d36 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -17,6 +17,7 @@
#include <linux/i8253.h>
#include <linux/time.h>
#include <linux/export.h>
+#include <linux/stacktrace.h>
#include <asm/vsyscall.h>
#include <asm/x86_init.h>
@@ -25,27 +26,24 @@
#include <asm/hpet.h>
#include <asm/time.h>
+static bool profile_pc_cb(void *arg, unsigned long pc)
+{
+ unsigned long *prof_pc = arg;
+
+ if (in_lock_functions(pc))
+ return true;
+
+ *prof_pc = pc;
+ return false;
+}
+
unsigned long profile_pc(struct pt_regs *regs)
{
- unsigned long pc = instruction_pointer(regs);
-
- if (!user_mode(regs) && in_lock_functions(pc)) {
-#ifdef CONFIG_FRAME_POINTER
- return *(unsigned long *)(regs->bp + sizeof(long));
-#else
- unsigned long *sp = (unsigned long *)regs->sp;
- /*
- * Return address is either directly at stack pointer
- * or above a saved flags. Eflags has bits 22-31 zero,
- * kernel addresses don't.
- */
- if (sp[0] >> 22)
- return sp[0];
- if (sp[1] >> 22)
- return sp[1];
-#endif
- }
- return pc;
+ unsigned long prof_pc = 0;
+
+ arch_stack_walk(profile_pc_cb, &prof_pc, current, regs);
+
+ return prof_pc;
}
EXPORT_SYMBOL(profile_pc);
--
2.20.1
On Thu, Mar 30, 2023 at 04:15:51PM +0800, Qi Zheng wrote: > The profile_pc() try to get pc by doing a trick to read > the contents of the stack. This may cause false positives > for KASAN, like the following: > > BUG: KASAN: stack-out-of-bounds in profile_pc+0x5b/0x90 > Read of size 8 at addr ffff8881062a7a00 by task id/130040 I don't think this was actually a false positive. The !FRAME_POINTER code in profile_pc() has been badly broken for many years. BTW, there was a similar patch here: https://lore.kernel.org/lkml/20230224021858.120078-1-chenzhongjin@huawei.com/ I thought CONFIG_PROFILING was obsolete but Andi said previously he wants to keep it for at least boot-time profiling. Andi did suggest removing the lock profiling hacks, which means all the profile_pc() implementations can just be removed in favor of the generic instruction_pointer(). -- Josh
On 2023/4/8 12:56, Josh Poimboeuf wrote: > On Thu, Mar 30, 2023 at 04:15:51PM +0800, Qi Zheng wrote: >> The profile_pc() try to get pc by doing a trick to read >> the contents of the stack. This may cause false positives >> for KASAN, like the following: >> >> BUG: KASAN: stack-out-of-bounds in profile_pc+0x5b/0x90 >> Read of size 8 at addr ffff8881062a7a00 by task id/130040 > > I don't think this was actually a false positive. The !FRAME_POINTER > code in profile_pc() has been badly broken for many years. > > BTW, there was a similar patch here: > > https://lore.kernel.org/lkml/20230224021858.120078-1-chenzhongjin@huawei.com/ Ah. > > I thought CONFIG_PROFILING was obsolete but Andi said previously he > wants to keep it for at least boot-time profiling. > > Andi did suggest removing the lock profiling hacks, which means all the > profile_pc() implementations can just be removed in favor of the generic > instruction_pointer(). That's great, and I see Chen Zhongjin will send a new patch for this, let him continue this work. :) > -- Thanks, Qi
© 2016 - 2026 Red Hat, Inc.