The crash memory allocation, and the exclusion of crashk_res, crashk_low_res
and crashk_cma memory, are almost identical across different architectures.
This patch set handles them in the crash core in a general way, which
eliminates a lot of duplicated code.

It also adds support for crashkernel CMA reservation for arm64 and riscv.

Rebased on v7.0-rc1.

Basic second-kernel boot tests were performed on QEMU platforms for the
x86, ARM64, and RISC-V architectures with the following parameters:

	"cma=256M crashkernel=256M crashkernel=64M,cma"

Changes in v9:
- Collect Reviewed-by and Acked-by, and prepare for Sashiko AI review.
- Link to v8: https://lore.kernel.org/all/20260302035315.3892241-1-ruanjinjie@huawei.com/

Changes in v8:
- Fix the build issues reported by the kernel test robot and Sourabh.
- Link to v7: https://lore.kernel.org/all/20260226130437.1867658-1-ruanjinjie@huawei.com/

Changes in v7:
- Correct the inclusion of CMA-reserved ranges for the kdump kernel in
  of/kexec for arm64 and riscv.
- Add Acked-by.
- Link to v6: https://lore.kernel.org/all/20260224085342.387996-1-ruanjinjie@huawei.com/

Changes in v6:
- Update the crash core exclude code as Mike suggested.
- Rebase on v7.0-rc1.
- Add Acked-by.
- Link to v5: https://lore.kernel.org/all/20260212101001.343158-1-ruanjinjie@huawei.com/

Changes in v5:
- Fix the kernel test robot build warnings.
- Sort crash memory ranges before preparing the elfcorehdr for powerpc.
- Link to v4: https://lore.kernel.org/all/20260209095931.2813152-1-ruanjinjie@huawei.com/

Changes in v4:
- Move the size calculation (and the realloc if needed) into the generic
  crash core.
- Link to v3: https://lore.kernel.org/all/20260204093728.1447527-1-ruanjinjie@huawei.com/

Jinjie Ruan (4):
  crash: Exclude crash kernel memory in crash core
  crash: Use crash_exclude_core_ranges() on powerpc
  arm64: kexec: Add support for crashkernel CMA reservation
  riscv: kexec: Add support for crashkernel CMA reservation

Sourabh Jain (1):
  powerpc/crash: sort crash memory ranges before preparing elfcorehdr

 .../admin-guide/kernel-parameters.txt      |  16 +--
 arch/arm64/kernel/machine_kexec_file.c     |  39 +++----
 arch/arm64/mm/init.c                       |   5 +-
 arch/loongarch/kernel/machine_kexec_file.c |  39 +++----
 arch/powerpc/include/asm/kexec_ranges.h    |   1 -
 arch/powerpc/kexec/crash.c                 |   5 +-
 arch/powerpc/kexec/ranges.c                | 101 +----------------
 arch/riscv/kernel/machine_kexec_file.c     |  38 +++----
 arch/riscv/mm/init.c                       |   5 +-
 arch/x86/kernel/crash.c                    |  89 +++------------
 drivers/of/fdt.c                           |   9 +-
 drivers/of/kexec.c                         |   9 ++
 include/linux/crash_core.h                 |   9 ++
 kernel/crash_core.c                        |  89 ++++++++++++++-
 14 files changed, 178 insertions(+), 276 deletions(-)

--
2.34.1
On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan <ruanjinjie@huawei.com> wrote:

> The crash memory allocation, and the exclusion of crashk_res, crashk_low_res
> and crashk_cma memory, are almost identical across different architectures.
> This patch set handles them in the crash core in a general way, which
> eliminates a lot of duplicated code.
>
> It also adds support for crashkernel CMA reservation for arm64 and riscv.

Thanks. AI review has completed and it asks questions:
https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie@huawei.com
On 2026/3/24 0:55, Andrew Morton wrote:
> On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan <ruanjinjie@huawei.com> wrote:
>
>> The crash memory allocation, and the exclusion of crashk_res, crashk_low_res
>> and crashk_cma memory, are almost identical across different architectures.
>> This patch set handles them in the crash core in a general way, which
>> eliminates a lot of duplicated code.
>>
>> It also adds support for crashkernel CMA reservation for arm64 and riscv.
>
> Thanks. AI review has completed and it asks questions:
> https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie@huawei.com
I believe it identified 4 valid issues:

- The already discovered bug that crashk_low_res is not excluded in the
  existing RISC-V code.

- An existing memory leak in the PowerPC code.

- The ordering issue of adding CMA ranges to "linux,usable-memory-range".

- An existing concurrency issue: concurrent memory hotplug may occur
  between reading memblock and attempting to fill cmem during kexec_load()
  on almost all existing architectures. I'm not sure whether this is a
  practical issue in reality.
Race Condition Scenario
Timeline:
---------------------------------------------------------------------
T1: kexec_load() syscall starts
T2: kexec_trylock() acquires kexec_lock
T3: crash_prepare_headers() is called
T4: arch_get_system_nr_ranges() queries memblock → finds 100 memory ranges
T5: cmem = alloc_cmem(100) allocates buffer for 100 ranges
T6: [RACE WINDOW] Another process triggers memory hotplug
T7: add_memory() → lock_device_hotplug() → memblock_add_node()
T8: New memory region added to memblock
T9: arch_crash_populate_cmem() iterates: now finds 102 ranges
T10: cmem->ranges[100] → OUT OF BOUNDS WRITE!
T11: cmem->ranges[101] → OUT OF BOUNDS WRITE!
T12: Kernel crash or memory corruption
Why This Happens
1. Different locks used:
- kexec_load() uses kexec_trylock (atomic_t)
- Memory hotplug uses device_hotplug_lock (mutex)
2. No synchronization between these two operations
3. Time-of-check to time-of-use (TOCTOU) issue:
- Step T4-T5: We query the number of ranges and allocate buffer
- Step T6-T9: Memory hotplug adds new ranges between query and
population
Any comments or suggestions on the following approach?
int crash_prepare_headers(...)
{
	unsigned int max_nr_ranges;
	struct crash_mem *cmem;
	int ret;

	lock_device_hotplug();

	max_nr_ranges = arch_get_system_nr_ranges();
	// ...
	ret = arch_crash_populate_cmem(cmem);
	// ...

	unlock_device_hotplug();
	return ret;
}
On 24/03/26 09:32, Jinjie Ruan wrote:
>
> On 2026/3/24 0:55, Andrew Morton wrote:
>> On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan <ruanjinjie@huawei.com> wrote:
>>
>>> The crash memory allocation, and the exclude of crashk_res, crashk_low_res
>>> and crashk_cma memory are almost identical across different architectures,
>>> This patch set handle them in crash core in a general way, which eliminate
>>> a lot of duplication code.
>>>
>>> And add support for crashkernel CMA reservation for arm64 and riscv.
>> Thanks. AI review has completed and it asks questions:
>> https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie@huawei.com
> I believe it identified 4 valid issues:
>
> - The already discovered bug that crashk_low_res is not excluded in the
>   existing RISC-V code.
>
> - An existing memory leak in the PowerPC code.
Yes, and the suggested approach to fix the issue looks good, which is
basically to replace the return with a goto out.
diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index 898742a5205c..1426d2099bad 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -440,7 +440,7 @@ static void update_crash_elfcorehdr(struct kimage *image, struct memory_notify *
 	ret = get_crash_memory_ranges(&cmem);
 	if (ret) {
 		pr_err("Failed to get crash mem range\n");
-		return;
+		goto out;
 	}

 	/*
Are you planning to handle this in this patch series? Or do you want me
to send a separate fix patch?
>
> - The ordering issue of adding CMA ranges to "linux,usable-memory-range".
>
> - An existing concurrency issue: concurrent memory hotplug may occur
>   between reading memblock and attempting to fill cmem during kexec_load()
>   on almost all existing architectures. I'm not sure whether this is a
>   practical issue in reality.
>
> Race Condition Scenario
>
> Timeline:
> ---------------------------------------------------------------------
> T1: kexec_load() syscall starts
> T2: kexec_trylock() acquires kexec_lock
> T3: crash_prepare_headers() is called
> T4: arch_get_system_nr_ranges() queries memblock → finds 100 memory ranges
> T5: cmem = alloc_cmem(100) allocates buffer for 100 ranges
> T6: [RACE WINDOW] Another process triggers memory hotplug
> T7: add_memory() → lock_device_hotplug() → memblock_add_node()
> T8: New memory region added to memblock
> T9: arch_crash_populate_cmem() iterates: now finds 102 ranges
> T10: cmem->ranges[100] → OUT OF BOUNDS WRITE!
> T11: cmem->ranges[101] → OUT OF BOUNDS WRITE!
> T12: Kernel crash or memory corruption
>
> Why This Happens
>
> 1. Different locks used:
> - kexec_load() uses kexec_trylock (atomic_t)
> - Memory hotplug uses device_hotplug_lock (mutex)
> 2. No synchronization between these two operations
> 3. Time-of-check to time-of-use (TOCTOU) issue:
> - Step T4-T5: We query the number of ranges and allocate buffer
> - Step T6-T9: Memory hotplug adds new ranges between query and
> population
>
>
>
> Any comments or suggestions on the following approach?
>
>
> int crash_prepare_headers(...)
> {
> 	unsigned int max_nr_ranges;
> 	struct crash_mem *cmem;
> 	int ret;
>
> 	lock_device_hotplug();
>
> 	max_nr_ranges = arch_get_system_nr_ranges();
> 	// ...
> 	ret = arch_crash_populate_cmem(cmem);
> 	// ...
>
> 	unlock_device_hotplug();
> 	return ret;
> }
>
>
On 2026/3/24 12:29, Sourabh Jain wrote:
>
>
> On 24/03/26 09:32, Jinjie Ruan wrote:
>>
>> On 2026/3/24 0:55, Andrew Morton wrote:
>>> On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan
>>> <ruanjinjie@huawei.com> wrote:
>>>
>>>> The crash memory allocation, and the exclusion of crashk_res,
>>>> crashk_low_res and crashk_cma memory, are almost identical across
>>>> different architectures. This patch set handles them in the crash
>>>> core in a general way, which eliminates a lot of duplicated code.
>>>>
>>>> It also adds support for crashkernel CMA reservation for arm64 and riscv.
>>> Thanks. AI review has completed and it asks questions:
>>> https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie@huawei.com
>> I believe it identified 4 valid issues:
>>
>> - The already discovered bug that crashk_low_res is not excluded in the
>>   existing RISC-V code.
>>
>> - An existing memory leak in the PowerPC code.
>
> Yes, and the suggested approach to fix the issue looks good, which is
> basically to replace the return with a goto out.
>
> diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
> index 898742a5205c..1426d2099bad 100644
> --- a/arch/powerpc/kexec/crash.c
> +++ b/arch/powerpc/kexec/crash.c
> @@ -440,7 +440,7 @@ static void update_crash_elfcorehdr(struct kimage *image, struct memory_notify *
>  	ret = get_crash_memory_ranges(&cmem);
>  	if (ret) {
>  		pr_err("Failed to get crash mem range\n");
> -		return;
> +		goto out;
>  	}
>
>  	/*
>
> Are you planning to handle this in this patch series? Or do you want me
> to send a separate fix patch?
Yes, will fix it in v10, thanks for the clarification.
Best regards,
Jinjie
>
>
>>
>> - The ordering issue of adding CMA ranges to "linux,usable-memory-range".
>>
>> - An existing concurrency issue: concurrent memory hotplug may occur
>>   between reading memblock and attempting to fill cmem during kexec_load()
>>   on almost all existing architectures. I'm not sure whether this is a
>>   practical issue in reality.
What are your thoughts on this concurrency issue?
>>
>> Race Condition Scenario
>>
>> Timeline:
>> ---------------------------------------------------------------------
>> T1: kexec_load() syscall starts
>> T2: kexec_trylock() acquires kexec_lock
>> T3: crash_prepare_headers() is called
>> T4: arch_get_system_nr_ranges() queries memblock → finds 100 memory
>> ranges
>> T5: cmem = alloc_cmem(100) allocates buffer for 100 ranges
>> T6: [RACE WINDOW] Another process triggers memory hotplug
>> T7: add_memory() → lock_device_hotplug() → memblock_add_node()
>> T8: New memory region added to memblock
>> T9: arch_crash_populate_cmem() iterates: now finds 102 ranges
>> T10: cmem->ranges[100] → OUT OF BOUNDS WRITE!
>> T11: cmem->ranges[101] → OUT OF BOUNDS WRITE!
>> T12: Kernel crash or memory corruption
>>
>> Why This Happens
>>
>> 1. Different locks used:
>> - kexec_load() uses kexec_trylock (atomic_t)
>> - Memory hotplug uses device_hotplug_lock (mutex)
>> 2. No synchronization between these two operations
>> 3. Time-of-check to time-of-use (TOCTOU) issue:
>> - Step T4-T5: We query the number of ranges and allocate buffer
>> - Step T6-T9: Memory hotplug adds new ranges between query and
>> population
>>
>>
>>
>> Any comments or suggestions on the following approach?
>>
>>
>> int crash_prepare_headers(...)
>> {
>> 	unsigned int max_nr_ranges;
>> 	struct crash_mem *cmem;
>> 	int ret;
>>
>> 	lock_device_hotplug();
>>
>> 	max_nr_ranges = arch_get_system_nr_ranges();
>> 	// ...
>> 	ret = arch_crash_populate_cmem(cmem);
>> 	// ...
>>
>> 	unlock_device_hotplug();
>> 	return ret;
>> }
>>
>>
>