From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
When booting with debug_pagealloc=on and the following configuration:
CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n
the system fails to boot due to page faults during kmemleak scanning.
Crash logs:
BUG: unable to handle page fault for address: ffff8880cd400000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 11de00067 P4D 11de00067 PUD 11af2b067 PMD 11aec1067 PTE 800fffff32bff020
Oops: Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
RIP: 0010:scan_block+0x43/0xb0
Call Trace:
<TASK>
scan_gray_list+0x2b5/0x2f0
kmemleak_scan+0x3b1/0xcf0
kmemleak_scan_thread+0x7d/0xc0
kthread+0x11c/0x240
ret_from_fork+0x2d3/0x370
ret_from_fork_asm+0x11/0x20
</TASK>
This occurs because:
With debug_pagealloc enabled, __free_pages() invokes
debug_pagealloc_unmap_pages(), clearing the _PAGE_PRESENT bit for
freed pages in the direct mapping.
Commit 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
releases the KHO scratch region via init_cma_reserved_pageblock(),
unmapping its physical pages. Subsequent kmemleak scanning accesses
these unmapped pages, triggering fatal page faults.
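The PTE value in the crash log above is consistent with this explanation: its bit 0, which is _PAGE_PRESENT on x86, is clear. The following is a minimal userspace sketch that checks this, not kernel code. The names PTE_PRESENT_BIT, pte_is_present() and check_logged_pte() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* On x86, bit 0 of a page-table entry is the present bit (_PAGE_PRESENT). */
#define PTE_PRESENT_BIT (UINT64_C(1) << 0)

/* Hypothetical helper: true if the raw PTE value has the present bit set. */
static bool pte_is_present(uint64_t pte)
{
	return (pte & PTE_PRESENT_BIT) != 0;
}

/* Check the raw PTE value copied from the crash log. */
static void check_logged_pte(void)
{
	const uint64_t logged_pte = UINT64_C(0x800fffff32bff020);

	/* Low byte is 0x20, so bit 0 is clear: the page is not present. */
	assert(!pte_is_present(logged_pte));
}
```

Calling check_logged_pte() passes silently, matching the "not-present page" error code reported by the page fault handler.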
This series introduces kmemleak_no_scan_phys(phys_addr_t),
a physical-address variant of kmemleak_no_scan(), which marks
a memblock region as OBJECT_NO_SCAN.
kho_reserve_scratch() calls it to exclude the reserved scratch
region from scanning before it is released to the buddy allocator.
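The effect of the fix can be sketched with a toy userspace model. This is only an illustration under simplifying assumptions, not the kmemleak implementation: every toy_* name is hypothetical, a "fault" is merely counted instead of crashing, and the scanner touches one word per page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 8

/* Models the present bit of each page in the direct mapping. */
static bool toy_present[NR_PAGES];

/* Models a tracked object covering pages [start, start + count). */
struct toy_object {
	size_t start;
	size_t count;
	bool no_scan;	/* stands in for OBJECT_NO_SCAN */
};

/* Map every page, as at early boot. */
static void toy_init_pages(void)
{
	for (size_t i = 0; i < NR_PAGES; i++)
		toy_present[i] = true;
}

/* Models freeing pages with debug_pagealloc on: the pages get unmapped. */
static void toy_free_pages(size_t start, size_t count)
{
	assert(start + count <= NR_PAGES);
	for (size_t i = start; i < start + count; i++)
		toy_present[i] = false;
}

/* Models marking an object so the scanner skips its contents. */
static void toy_no_scan(struct toy_object *obj)
{
	obj->no_scan = true;
}

/* Scan all objects; return how many unmapped pages were touched. */
static size_t toy_scan(const struct toy_object *objs, size_t nr_objs)
{
	size_t faults = 0;

	for (size_t i = 0; i < nr_objs; i++) {
		if (objs[i].no_scan)
			continue;
		for (size_t p = objs[i].start; p < objs[i].start + objs[i].count; p++)
			if (!toy_present[p])
				faults++;	/* the real kernel would oops here */
	}
	return faults;
}
```

In this model, an object spanning four pages that are later freed yields four faults per scan. Marking it with toy_no_scan() before the free brings the count to zero, which mirrors the ordering used in kho_reserve_scratch().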
This series is based on linux-next tag next-20251119.
Ran Xiaokai (2):
mm: kmemleak: introduce kmemleak_no_scan_phys() helper
liveupdate: Fix boot failure due to kmemleak access to unmapped pages
include/linux/kmemleak.h | 4 ++++
kernel/liveupdate/kexec_handover.c | 4 ++++
mm/kmemleak.c | 15 ++++++++++++---
3 files changed, 20 insertions(+), 3 deletions(-)
--
2.25.1