[PATCH v10 0/8] arm64/riscv: Add support for crashkernel CMA reservation

Jinjie Ruan posted 8 patches 1 week, 1 day ago
There is a newer version of this series
[PATCH v10 0/8] arm64/riscv: Add support for crashkernel CMA reservation
Posted by Jinjie Ruan 1 week, 1 day ago
The crash memory allocation, and the exclusion of crashk_res, crashk_low_res
and crashk_cma memory, are almost identical across architectures. This
patch set handles them in the crash core in a generic way, which eliminates
a lot of duplicated code.

And add support for crashkernel CMA reservation for arm64 and riscv.

Rebased on v7.0-rc1.

Basic second-kernel boot tests were performed on QEMU platforms for the
x86, arm64 and RISC-V architectures with the following parameters:

	"cma=256M crashkernel=256M crashkernel=64M,cma"

Changes in v10:
- Fix a bug in the existing RISC-V code where crashk_low_res was not
  excluded.
- Fix an existing memory leak in the PowerPC code.
- Fix the ordering of adding CMA ranges to
  "linux,usable-memory-range".
- Fix an existing concurrency issue: on almost all architectures, a
  concurrent memory hotplug may occur between reading memblock and
  attempting to fill cmem during kexec_load().
- Link to v9: https://lore.kernel.org/all/20260323072745.2481719-1-ruanjinjie@huawei.com/

Changes in v9:
- Collect Reviewed-by and Acked-by, and prepare for Sashiko AI review.
- Link to v8: https://lore.kernel.org/all/20260302035315.3892241-1-ruanjinjie@huawei.com/

Changes in v8:
- Fix the build issues reported by kernel test robot and Sourabh.
- Link to v7: https://lore.kernel.org/all/20260226130437.1867658-1-ruanjinjie@huawei.com/

Changes in v7:
- Correct the inclusion of CMA-reserved ranges for kdump kernel in of/kexec
  for arm64 and riscv.
- Add Acked-by.
- Link to v6: https://lore.kernel.org/all/20260224085342.387996-1-ruanjinjie@huawei.com/

Changes in v6:
- Update the crash core exclude code as Mike suggested.
- Rebased on v7.0-rc1.
- Add Acked-by.
- Link to v5: https://lore.kernel.org/all/20260212101001.343158-1-ruanjinjie@huawei.com/

Changes in v5:
- Fix the kernel test robot build warnings.
- Sort crash memory ranges before preparing elfcorehdr for powerpc
- Link to v4: https://lore.kernel.org/all/20260209095931.2813152-1-ruanjinjie@huawei.com/

Changes in v4:
- Move the size calculation (and the realloc if needed) into the
  generic crash.
- Link to v3: https://lore.kernel.org/all/20260204093728.1447527-1-ruanjinjie@huawei.com/

Jinjie Ruan (7):
  riscv: kexec_file: Fix crashk_low_res not exclude bug
  powerpc/crash: Fix possible memory leak in update_crash_elfcorehdr()
  crash: Exclude crash kernel memory in crash core
  crash: Use crash_exclude_core_ranges() on powerpc
  arm64: kexec: Add support for crashkernel CMA reservation
  riscv: kexec: Add support for crashkernel CMA reservation
  crash: Fix race condition between crash kernel loading and memory
    hotplug

Sourabh Jain (1):
  powerpc/crash: sort crash memory ranges before preparing elfcorehdr

 .../admin-guide/kernel-parameters.txt         |  16 +--
 arch/arm64/kernel/machine_kexec_file.c        |  39 ++-----
 arch/arm64/mm/init.c                          |   5 +-
 arch/loongarch/kernel/machine_kexec_file.c    |  39 ++-----
 arch/powerpc/include/asm/kexec_ranges.h       |   1 -
 arch/powerpc/kexec/crash.c                    |   7 +-
 arch/powerpc/kexec/ranges.c                   | 101 +----------------
 arch/riscv/kernel/machine_kexec_file.c        |  38 ++-----
 arch/riscv/mm/init.c                          |   5 +-
 arch/x86/kernel/crash.c                       |  89 ++-------------
 drivers/of/fdt.c                              |   9 +-
 drivers/of/kexec.c                            |   9 ++
 include/linux/crash_core.h                    |   9 ++
 kernel/crash_core.c                           | 105 +++++++++++++++++-
 14 files changed, 195 insertions(+), 277 deletions(-)

-- 
2.34.1
Re: [PATCH v10 0/8] arm64/riscv: Add support for crashkernel CMA reservation
Posted by Andrew Morton 1 week ago
On Wed, 25 Mar 2026 10:58:56 +0800 Jinjie Ruan <ruanjinjie@huawei.com> wrote:

> The crash memory allocation, and the exclude of crashk_res, crashk_low_res
> and crashk_cma memory are almost identical across different architectures,
> This patch set handle them in crash core in a general way, which eliminate
> a lot of duplication code.
> 
> And add support for crashkernel CMA reservation for arm64 and riscv.

So who is patchmonkey for this.

>  .../admin-guide/kernel-parameters.txt         |  16 +--
>  arch/arm64/kernel/machine_kexec_file.c        |  39 ++-----
>  arch/arm64/mm/init.c                          |   5 +-
>  arch/loongarch/kernel/machine_kexec_file.c    |  39 ++-----
>  arch/powerpc/include/asm/kexec_ranges.h       |   1 -
>  arch/powerpc/kexec/crash.c                    |   7 +-
>  arch/powerpc/kexec/ranges.c                   | 101 +----------------
>  arch/riscv/kernel/machine_kexec_file.c        |  38 ++-----
>  arch/riscv/mm/init.c                          |   5 +-
>  arch/x86/kernel/crash.c                       |  89 ++-------------
>  drivers/of/fdt.c                              |   9 +-
>  drivers/of/kexec.c                            |   9 ++
>  include/linux/crash_core.h                    |   9 ++
>  kernel/crash_core.c                           | 105 +++++++++++++++++-

Me, I guess, with as many arch acks as I can gather, please.

I'm seriously trying to slow things down now, but I guess I can make an
exception for non-MM material.

AI review asks a few questions:
	https://sashiko.dev/#/patchset/20260325025904.2811960-1-ruanjinjie@huawei.com

Can you please check these?  And I'm interested in learning how many of
these are valid.  Thanks.
Re: [PATCH v10 0/8] arm64/riscv: Add support for crashkernel CMA reservation
Posted by Jinjie Ruan 1 week ago

On 2026/3/26 12:00, Andrew Morton wrote:
> On Wed, 25 Mar 2026 10:58:56 +0800 Jinjie Ruan <ruanjinjie@huawei.com> wrote:
> 
>> The crash memory allocation, and the exclude of crashk_res, crashk_low_res
>> and crashk_cma memory are almost identical across different architectures,
>> This patch set handle them in crash core in a general way, which eliminate
>> a lot of duplication code.
>>
>> And add support for crashkernel CMA reservation for arm64 and riscv.
> 
> So who is patchmonkey for this.
> 
>>  .../admin-guide/kernel-parameters.txt         |  16 +--
>>  arch/arm64/kernel/machine_kexec_file.c        |  39 ++-----
>>  arch/arm64/mm/init.c                          |   5 +-
>>  arch/loongarch/kernel/machine_kexec_file.c    |  39 ++-----
>>  arch/powerpc/include/asm/kexec_ranges.h       |   1 -
>>  arch/powerpc/kexec/crash.c                    |   7 +-
>>  arch/powerpc/kexec/ranges.c                   | 101 +----------------
>>  arch/riscv/kernel/machine_kexec_file.c        |  38 ++-----
>>  arch/riscv/mm/init.c                          |   5 +-
>>  arch/x86/kernel/crash.c                       |  89 ++-------------
>>  drivers/of/fdt.c                              |   9 +-
>>  drivers/of/kexec.c                            |   9 ++
>>  include/linux/crash_core.h                    |   9 ++
>>  kernel/crash_core.c                           | 105 +++++++++++++++++-
> 
> Me, I guess, with as many arch acks as I can gather, please.
> 
> I'm seriously trying to slow things down now, but I guess I can make an
> exception for non-MM material.
> 
> AI review asks a few questions:
> 	https://sashiko.dev/#/patchset/20260325025904.2811960-1-ruanjinjie@huawei.com
> 
> Can you please check these?  And I'm interested in learning how many of
> these are valid.  Thanks.

Thanks for the feedback. At the very least, the issue highlighted below
remains valid and needs to be addressed; it can be fixed with a fixed
number of usable ranges, as below.

+#define MAX_USABLE_RANGES		(6)

"
> */
> -#define MAX_USABLE_RANGES		2
> +#define MAX_USABLE_RANGES		(2 + CRASHKERNEL_CMA_RANGES_MAX)
Could this silently drop crash memory if the crash kernel is built
without CONFIG_CMA?
If the main kernel is compiled with CONFIG_CMA, it might append up to 6
regions to the linux,usable-memory-range property (2 standard + 4 CMA).
If the crash kernel is compiled without CONFIG_CMA,
CRASHKERNEL_CMA_RANGES_MAX evaluates to 0. During boot, the crash
kernel's FDT parsing logic limits the ranges to MAX_USABLE_RANGES (which
would be 2). This truncates the ranges and drops all the CMA-backed
crash memory.
Since the crash kernel only registers these regions as normal RAM, it
doesn't require CONFIG_CMA to use them. Should MAX_USABLE_RANGES
unconditionally accommodate the maximum possible ranges any sender might
provide?"



Regarding the concurrency issue mentioned below, as noted in my v9
response, could we confirm whether this is indeed a bug? If so, I believe
each affected architecture may need a separate fix, since the issue
already exists in the current code and likely requires a more
comprehensive solution. Does anyone have thoughts on whether this is a
valid concern, or suggestions on how to address it?


Link:
https://lore.kernel.org/all/4cfde40c-673a-12b0-dfc5-703d582d6ea9@huawei.com/


"
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -326,15 +326,25 @@ int crash_prepare_headers(int need_kernel_map,
void **addr, unsigned long *sz,
>  	struct crash_mem *cmem;
>  	int ret;
>
> +	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
> +		lock_device_hotplug();
Can this cause a self-deadlock during memory hotplug?
When memory is onlined or offlined via sysfs (for example, in state_store()
calling lock_device_hotplug_sysfs()), the device_hotplug_lock is already
acquired before initiating the hotplug sequence.
Since crash_prepare_headers() can be invoked by the crash_memhp_notifier
during these memory hotplug operations, acquiring device_hotplug_lock again
here would cause a recursive mutex acquisition.
> +
>  	max_nr_ranges = arch_get_system_nr_ranges();
> -	if (!max_nr_ranges)
> -		return -ENOMEM;
> +	if (!max_nr_ranges) {
> +		ret = -ENOMEM;
> +		goto unlock;
> +	}
>
>  	cmem = alloc_cmem(max_nr_ranges);
> -	if (!cmem)
> -		return -ENOMEM;
> +	if (!cmem) {
> +		ret = -ENOMEM;
> +		goto unlock;
> +	}
>
>  	ret = arch_crash_populate_cmem(cmem);
While the locking attempts to address the race, is it possible that the
architecture-specific population callbacks could still write out of
bounds?
The callbacks appear to unconditionally write to
cmem->ranges[cmem->nr_ranges] without verifying if
cmem->nr_ranges >= cmem->max_nr_ranges.
Would it be safer to also add explicit bounds checking inside the
populate callbacks to return an error like -ENOMEM when the array
capacity is exceeded?"
