[PATCH 2/2] liveupdate: Fix boot failure due to kmemleak access to unmapped pages

ranxiaokai627@163.com posted 2 patches 1 week, 4 days ago
[PATCH 2/2] liveupdate: Fix boot failure due to kmemleak access to unmapped pages
Posted by ranxiaokai627@163.com 1 week, 4 days ago
From: Ran Xiaokai <ran.xiaokai@zte.com.cn>

When booting with debug_pagealloc=on while having:
CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n
the system fails to boot due to page faults during kmemleak scanning.

This occurs because:
With debug_pagealloc enabled, __free_pages() invokes
debug_pagealloc_unmap_pages(), clearing the _PAGE_PRESENT bit for
freed pages in the direct mapping.
Commit 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
releases the KHO scratch region via init_cma_reserved_pageblock(),
unmapping its physical pages. Subsequent kmemleak scanning accesses
these unmapped pages, triggering fatal page faults.

Call kmemleak_no_scan_phys() from kho_reserve_scratch() to
exclude the reserved region from scanning before
it is released to the buddy allocator.

Fixes: 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
---
 kernel/liveupdate/kexec_handover.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index 224bdf5becb6..dd4942d1d76c 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -11,6 +11,7 @@
 
 #include <linux/cleanup.h>
 #include <linux/cma.h>
+#include <linux/kmemleak.h>
 #include <linux/count_zeros.h>
 #include <linux/kexec.h>
 #include <linux/kexec_handover.h>
@@ -654,6 +655,7 @@ static void __init kho_reserve_scratch(void)
 	if (!addr)
 		goto err_free_scratch_desc;
 
+	kmemleak_no_scan_phys(addr);
 	kho_scratch[i].addr = addr;
 	kho_scratch[i].size = size;
 	i++;
@@ -664,6 +666,7 @@ static void __init kho_reserve_scratch(void)
 	if (!addr)
 		goto err_free_scratch_areas;
 
+	kmemleak_no_scan_phys(addr);
 	kho_scratch[i].addr = addr;
 	kho_scratch[i].size = size;
 	i++;
@@ -676,6 +679,7 @@ static void __init kho_reserve_scratch(void)
 		if (!addr)
 			goto err_free_scratch_areas;
 
+		kmemleak_no_scan_phys(addr);
 		kho_scratch[i].addr = addr;
 		kho_scratch[i].size = size;
 		i++;
-- 
2.25.1
Re: [PATCH 2/2] liveupdate: Fix boot failure due to kmemleak access to unmapped pages
Posted by Mike Rapoport 1 week, 3 days ago
On Thu, Nov 20, 2025 at 02:41:47PM +0000, ranxiaokai627@163.com wrote:
> Subject: liveupdate: Fix boot failure due to kmemleak access to unmapped pages

Please prefix kexec handover patches with kho: rather than liveupdate.

> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> 
> When booting with debug_pagealloc=on while having:
> CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y
> CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n
> the system fails to boot due to page faults during kmemleak scanning.
> 
> This occurs because:
> With debug_pagealloc enabled, __free_pages() invokes
> debug_pagealloc_unmap_pages(), clearing the _PAGE_PRESENT bit for
> freed pages in the direct mapping.
> Commit 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
> releases the KHO scratch region via init_cma_reserved_pageblock(),
> unmapping its physical pages. Subsequent kmemleak scanning accesses
> these unmapped pages, triggering fatal page faults.
> 
> Call kmemleak_no_scan_phys() from kho_reserve_scratch() to
> exclude the reserved region from scanning before
> it is released to the buddy allocator.
> 
> Fixes: 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> ---
>  kernel/liveupdate/kexec_handover.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 224bdf5becb6..dd4942d1d76c 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -11,6 +11,7 @@
>  
>  #include <linux/cleanup.h>
>  #include <linux/cma.h>
> +#include <linux/kmemleak.h>
>  #include <linux/count_zeros.h>
>  #include <linux/kexec.h>
>  #include <linux/kexec_handover.h>
> @@ -654,6 +655,7 @@ static void __init kho_reserve_scratch(void)
>  	if (!addr)
>  		goto err_free_scratch_desc;
>  
> +	kmemleak_no_scan_phys(addr);

There's kmemleak_ignore_phys(), which can be called after the scratch
areas are allocated from memblock; with that, kmemleak should not access
them.

Take a look at __cma_declare_contiguous_nid().
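
To make the suggestion concrete, here is a small userspace model of what
marking an object as ignored does to a scan pass. It is a sketch, not
kernel code: every toy_ name is hypothetical, and it assumes that an
ignored object is both never reported as a leak and never read by the
scanner, which is the behaviour the suggestion above relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kmemleak's per-object state; all names are hypothetical. */
enum toy_flags { TOY_NO_SCAN = 1u << 0 };

struct toy_object {
	unsigned long phys;	/* start of the tracked block */
	size_t size;		/* length of the tracked block */
	unsigned int flags;	/* TOY_NO_SCAN when the scanner must skip it */
	bool black;		/* never reported as a leak */
	bool mapped;		/* false once debug_pagealloc has unmapped it */
};

/* Models the assumed effect of kmemleak_ignore_phys(): black and unscanned. */
static void toy_ignore(struct toy_object *obj)
{
	assert(obj != NULL);
	obj->black = true;
	obj->flags |= TOY_NO_SCAN;
}

/*
 * Models one scan pass. Returns the number of objects whose contents were
 * read, or -1 if the scanner touched an unmapped block (the boot failure).
 */
static int toy_scan(const struct toy_object *objs, size_t n)
{
	int scanned = 0;

	for (size_t i = 0; i < n; i++) {
		if (objs[i].flags & TOY_NO_SCAN)
			continue;	/* excluded: contents are never read */
		if (!objs[i].mapped)
			return -1;	/* would fault on the unmapped page */
		scanned++;
	}
	return scanned;
}
```

With one mapped object and one unmapped scratch object, toy_scan()
reports the fault until toy_ignore() is applied to the scratch object;
after that only the mapped object is read.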

>  	kho_scratch[i].addr = addr;
>  	kho_scratch[i].size = size;
>  	i++;

-- 
Sincerely yours,
Mike.
Re: [PATCH 2/2] liveupdate: Fix boot failure due to kmemleak access to unmapped pages
Posted by ranxiaokai627@163.com 1 week, 2 days ago
>On Thu, Nov 20, 2025 at 02:41:47PM +0000, ranxiaokai627@163.com wrote:
>> Subject: liveupdate: Fix boot failure due to kmemleak access to unmapped pages
>
>Please prefix kexec handover patches with kho: rather than liveupdate.

Thanks for your review, I will update the patch subject.

>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>> 
>> When booting with debug_pagealloc=on while having:
>> CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y
>> CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n
>> the system fails to boot due to page faults during kmemleak scanning.
>> 
>> This occurs because:
>> With debug_pagealloc enabled, __free_pages() invokes
>> debug_pagealloc_unmap_pages(), clearing the _PAGE_PRESENT bit for
>> freed pages in the direct mapping.
>> Commit 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
>> releases the KHO scratch region via init_cma_reserved_pageblock(),
>> unmapping its physical pages. Subsequent kmemleak scanning accesses
>> these unmapped pages, triggering fatal page faults.
>> 
>> Call kmemleak_no_scan_phys() from kho_reserve_scratch() to
>> exclude the reserved region from scanning before
>> it is released to the buddy allocator.
>> 
>> Fixes: 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
>> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>> ---
>>  kernel/liveupdate/kexec_handover.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>> 
>> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
>> index 224bdf5becb6..dd4942d1d76c 100644
>> --- a/kernel/liveupdate/kexec_handover.c
>> +++ b/kernel/liveupdate/kexec_handover.c
>> @@ -11,6 +11,7 @@
>>  
>>  #include <linux/cleanup.h>
>>  #include <linux/cma.h>
>> +#include <linux/kmemleak.h>
>>  #include <linux/count_zeros.h>
>>  #include <linux/kexec.h>
>>  #include <linux/kexec_handover.h>
>> @@ -654,6 +655,7 @@ static void __init kho_reserve_scratch(void)
>>  	if (!addr)
>>  		goto err_free_scratch_desc;
>>  
>> +	kmemleak_no_scan_phys(addr);
>
>There's kmemleak_ignore_phys(), which can be called after the scratch
>areas are allocated from memblock; with that, kmemleak should not access
>them.
>
>Take a look at __cma_declare_contiguous_nid().

Thanks for catching this.
Since kmemleak_ignore_phys() perfectly handles this issue,
introducing another helper is unnecessary.
I'll post v2 shortly.

>>  	kho_scratch[i].addr = addr;
>>  	kho_scratch[i].size = size;
>>  	i++;
>
>-- 
>Sincerely yours,
>Mike.
Re: [PATCH 2/2] liveupdate: Fix boot failure due to kmemleak access to unmapped pages
Posted by Pratyush Yadav 1 week, 4 days ago
On Thu, Nov 20 2025, ranxiaokai627@163.com wrote:

> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>
> When booting with debug_pagealloc=on while having:
> CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y
> CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n
> the system fails to boot due to page faults during kmemleak scanning.
>
> This occurs because:
> With debug_pagealloc enabled, __free_pages() invokes
> debug_pagealloc_unmap_pages(), clearing the _PAGE_PRESENT bit for
> freed pages in the direct mapping.
> Commit 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
> releases the KHO scratch region via init_cma_reserved_pageblock(),
> unmapping its physical pages. Subsequent kmemleak scanning accesses
> these unmapped pages, triggering fatal page faults.

I don't know how kmemleak works. Why does kmemleak access the unmapped
pages? If pages are not mapped, it should learn to not access them,
right?

>
> Call kmemleak_no_scan_phys() from kho_reserve_scratch() to
> exclude the reserved region from scanning before
> it is released to the buddy allocator.

kho_reserve_scratch() is called on the first boot. It allocates the
scratch areas for subsequent boots. On every KHO boot after this,
kho_reserve_scratch() is not called and kho_release_scratch() is called
instead since the scratch areas already exist from previous boot.

Eventually both paths converge to kho_init() and call
init_cma_reserved_pageblock().

So shouldn't you call kmemleak_no_scan_phys() from kho_init() instead?
This would reduce code duplication and cover both paths.

>
> Fixes: 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> ---
>  kernel/liveupdate/kexec_handover.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 224bdf5becb6..dd4942d1d76c 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -11,6 +11,7 @@
>  
>  #include <linux/cleanup.h>
>  #include <linux/cma.h>
> +#include <linux/kmemleak.h>
>  #include <linux/count_zeros.h>
>  #include <linux/kexec.h>
>  #include <linux/kexec_handover.h>
> @@ -654,6 +655,7 @@ static void __init kho_reserve_scratch(void)
>  	if (!addr)
>  		goto err_free_scratch_desc;
>  
> +	kmemleak_no_scan_phys(addr);
>  	kho_scratch[i].addr = addr;
>  	kho_scratch[i].size = size;
>  	i++;
> @@ -664,6 +666,7 @@ static void __init kho_reserve_scratch(void)
>  	if (!addr)
>  		goto err_free_scratch_areas;
>  
> +	kmemleak_no_scan_phys(addr);
>  	kho_scratch[i].addr = addr;
>  	kho_scratch[i].size = size;
>  	i++;
> @@ -676,6 +679,7 @@ static void __init kho_reserve_scratch(void)
>  		if (!addr)
>  			goto err_free_scratch_areas;
>  
> +		kmemleak_no_scan_phys(addr);
>  		kho_scratch[i].addr = addr;
>  		kho_scratch[i].size = size;
>  		i++;

-- 
Regards,
Pratyush Yadav
Re: [PATCH 2/2] liveupdate: Fix boot failure due to kmemleak access to unmapped pages
Posted by ranxiaokai627@163.com 1 week, 2 days ago
>On Thu, Nov 20 2025, ranxiaokai627@163.com wrote:
>
>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>>
>> When booting with debug_pagealloc=on while having:
>> CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y
>> CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n
>> the system fails to boot due to page faults during kmemleak scanning.
>>
>> This occurs because:
>> With debug_pagealloc enabled, __free_pages() invokes
>> debug_pagealloc_unmap_pages(), clearing the _PAGE_PRESENT bit for
>> freed pages in the direct mapping.
>> Commit 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
>> releases the KHO scratch region via init_cma_reserved_pageblock(),
>> unmapping its physical pages. Subsequent kmemleak scanning accesses
>> these unmapped pages, triggering fatal page faults.
>
>I don't know how kmemleak works. Why does kmemleak access the unmapped
>pages? If pages are not mapped, it should learn to not access them,
>right?
>
>>
>> Call kmemleak_no_scan_phys() from kho_reserve_scratch() to
>> exclude the reserved region from scanning before
>> it is released to the buddy allocator.
>
>kho_reserve_scratch() is called on the first boot. It allocates the
>scratch areas for subsequent boots. On every KHO boot after this,
>kho_reserve_scratch() is not called and kho_release_scratch() is called
>instead since the scratch areas already exist from previous boot.
>
>Eventually both paths converge to kho_init() and call
>init_cma_reserved_pageblock().
>
>So shouldn't you call kmemleak_no_scan_phys() from kho_init() instead?
>This would reduce code duplication and cover both paths.

Thanks for your review!

Yes, both paths converge on kho_init(), but they diverge again inside it.
On the first boot, kho_get_fdt() returns NULL and
init_cma_reserved_pageblock() is called. On a KHO boot, kho_get_fdt()
returns non-NULL and kho_init() returns before calling
init_cma_reserved_pageblock().

Calling kmemleak_no_scan_phys() on a KHO boot is unnecessary anyway:
kmemleak objects are created when memblock_phys_alloc() is called, and a
KHO boot does not allocate the scratch areas with memblock_phys_alloc().
So moving the kmemleak_no_scan_phys() call into kho_init() both resolves
the issue and reduces code duplication.
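
The two boot paths described above can be modelled in userspace roughly
as follows. This is only a sketch of the reasoning, not the kernel
implementation: all toy_ names are hypothetical, and it assumes the
exclusion call in kho_init() sits after the kho_get_fdt() early return,
next to the loop that releases the scratch pageblocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SCRATCH 4

/* Hypothetical model of one scratch area; not a kernel structure. */
struct toy_scratch {
	bool tracked;	/* a kmemleak object exists (memblock allocation) */
	bool ignored;	/* excluded from kmemleak scanning */
	bool unmapped;	/* released to the allocator under debug_pagealloc */
};

struct toy_boot {
	bool have_fdt;	/* stands in for kho_get_fdt() != NULL */
	size_t cnt;
	struct toy_scratch scratch[TOY_MAX_SCRATCH];
};

/* First boot: scratch comes from memblock, so kmemleak tracks it. */
static void toy_reserve_scratch(struct toy_boot *b, size_t cnt)
{
	assert(cnt <= TOY_MAX_SCRATCH);
	b->have_fdt = false;
	b->cnt = cnt;
	for (size_t i = 0; i < cnt; i++)
		b->scratch[i] = (struct toy_scratch){ .tracked = true };
}

/* KHO boot: scratch is inherited; no memblock allocation, no tracking. */
static void toy_inherit_scratch(struct toy_boot *b, size_t cnt)
{
	assert(cnt <= TOY_MAX_SCRATCH);
	b->have_fdt = true;
	b->cnt = cnt;
	for (size_t i = 0; i < cnt; i++)
		b->scratch[i] = (struct toy_scratch){ .tracked = false };
}

/* Models kho_init(); 'fixed' selects whether the exclusion call is present. */
static void toy_kho_init(struct toy_boot *b, bool fixed)
{
	if (b->have_fdt)
		return;	/* KHO boot returns before the release loop */

	for (size_t i = 0; i < b->cnt; i++) {
		if (fixed)
			b->scratch[i].ignored = true;
		b->scratch[i].unmapped = true;
	}
}

/* A scan faults if it reads a tracked, unmapped area that is not ignored. */
static bool toy_scan_faults(const struct toy_boot *b)
{
	for (size_t i = 0; i < b->cnt; i++) {
		const struct toy_scratch *s = &b->scratch[i];

		if (s->tracked && s->unmapped && !s->ignored)
			return true;
	}
	return false;
}
```

In this model a first boot faults only when the exclusion is missing,
while a KHO boot never faults and never needs the exclusion, because
nothing is tracked there in the first place.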