[v9] arm64/riscv: Add support for crashkernel CMA reservation

[PATCH v9 0/5] arm64/riscv: Add support for crashkernel CMA reservation

Posted by Jinjie Ruan 1 week, 4 days ago

The crash memory allocation and the exclusion of crashk_res, crashk_low_res
and crashk_cma memory are almost identical across architectures. This patch
set handles them in the crash core in a generic way, which eliminates a lot
of duplicated code.

It also adds support for crashkernel CMA reservation on arm64 and riscv.

Rebased on v7.0-rc1.

Basic second-kernel boot tests were performed on QEMU platforms for x86,
ARM64, and RISC-V architectures with the following parameters:

	"cma=256M crashkernel=256M crashkernel=64M,cma"

Changes in v9:
- Collect Reviewed-by and Acked-by, and prepare for Sashiko AI review.
- Link to v8: https://lore.kernel.org/all/20260302035315.3892241-1-ruanjinjie@huawei.com/

Changes in v8:
- Fix the build issues reported by kernel test robot and Sourabh.
- Link to v7: https://lore.kernel.org/all/20260226130437.1867658-1-ruanjinjie@huawei.com/

Changes in v7:
- Correct the inclusion of CMA-reserved ranges for kdump kernel in of/kexec
  for arm64 and riscv.
- Add Acked-by.
- Link to v6: https://lore.kernel.org/all/20260224085342.387996-1-ruanjinjie@huawei.com/

Changes in v6:
- Update the crash core exclude code as Mike suggested.
- Rebased on v7.0-rc1.
- Add Acked-by.
- Link to v5: https://lore.kernel.org/all/20260212101001.343158-1-ruanjinjie@huawei.com/

Changes in v5:
- Fix the kernel test robot build warnings.
- Sort crash memory ranges before preparing elfcorehdr for powerpc.
- Link to v4: https://lore.kernel.org/all/20260209095931.2813152-1-ruanjinjie@huawei.com/

Changes in v4:
- Move the size calculation (and the realloc if needed) into the
  generic crash.
- Link to v3: https://lore.kernel.org/all/20260204093728.1447527-1-ruanjinjie@huawei.com/

Jinjie Ruan (4):
  crash: Exclude crash kernel memory in crash core
  crash: Use crash_exclude_core_ranges() on powerpc
  arm64: kexec: Add support for crashkernel CMA reservation
  riscv: kexec: Add support for crashkernel CMA reservation

Sourabh Jain (1):
  powerpc/crash: sort crash memory ranges before preparing elfcorehdr

 .../admin-guide/kernel-parameters.txt         |  16 +--
 arch/arm64/kernel/machine_kexec_file.c        |  39 +++----
 arch/arm64/mm/init.c                          |   5 +-
 arch/loongarch/kernel/machine_kexec_file.c    |  39 +++----
 arch/powerpc/include/asm/kexec_ranges.h       |   1 -
 arch/powerpc/kexec/crash.c                    |   5 +-
 arch/powerpc/kexec/ranges.c                   | 101 +-----------------
 arch/riscv/kernel/machine_kexec_file.c        |  38 +++----
 arch/riscv/mm/init.c                          |   5 +-
 arch/x86/kernel/crash.c                       |  89 +++------------
 drivers/of/fdt.c                              |   9 +-
 drivers/of/kexec.c                            |   9 ++
 include/linux/crash_core.h                    |   9 ++
 kernel/crash_core.c                           |  89 ++++++++++++++-
 14 files changed, 178 insertions(+), 276 deletions(-)

-- 
2.34.1

Re: [PATCH v9 0/5] arm64/riscv: Add support for crashkernel CMA reservation

Posted by Andrew Morton 1 week, 4 days ago

On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan <ruanjinjie@huawei.com> wrote:

> The crash memory allocation, and the exclude of crashk_res, crashk_low_res
> and crashk_cma memory are almost identical across different architectures,
> This patch set handle them in crash core in a general way, which eliminate
> a lot of duplication code.
> 
> And add support for crashkernel CMA reservation for arm64 and riscv.

Thanks.  AI review has completed and it asks questions:
	https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie@huawei.com

Re: [PATCH v9 0/5] arm64/riscv: Add support for crashkernel CMA reservation

Posted by Jinjie Ruan 1 week, 3 days ago


On 2026/3/24 0:55, Andrew Morton wrote:
> On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan <ruanjinjie@huawei.com> wrote:
> 
>> The crash memory allocation, and the exclude of crashk_res, crashk_low_res
>> and crashk_cma memory are almost identical across different architectures,
>> This patch set handle them in crash core in a general way, which eliminate
>> a lot of duplication code.
>>
>> And add support for crashkernel CMA reservation for arm64 and riscv.
> 
> Thanks.  AI review has completed and it asks questions:
> 	https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie@huawei.com

I believe it identified 4 valid issues:

- The already-known bug in the existing RISC-V code where crashk_low_res
is not excluded.

- A memory leak in the existing PowerPC code.

- The ordering issue of adding CMA ranges to "linux,usable-memory-range".

- An existing concurrency issue. A concurrent memory hotplug may occur
between reading memblock and filling cmem during kexec_load() on almost
all existing architectures. I'm not sure whether this is a practical
issue in reality.

 Race Condition Scenario

  Timeline:
  ---------------------------------------------------------------------
  T1: kexec_load() syscall starts
  T2: kexec_trylock() acquires kexec_lock
  T3: crash_prepare_headers() is called
  T4: arch_get_system_nr_ranges() queries memblock → finds 100 memory ranges
  T5: cmem = alloc_cmem(100) allocates buffer for 100 ranges
  T6: [RACE WINDOW] Another process triggers memory hotplug
  T7: add_memory() → lock_device_hotplug() → memblock_add_node()
  T8: New memory region added to memblock
  T9: arch_crash_populate_cmem() iterates: now finds 102 ranges
  T10: cmem->ranges[100] → OUT OF BOUNDS WRITE!
  T11: cmem->ranges[101] → OUT OF BOUNDS WRITE!
  T12: Kernel crash or memory corruption

  Why This Happens

  1. Different locks used:
    - kexec_load() uses kexec_trylock (atomic_t)
    - Memory hotplug uses device_hotplug_lock (mutex)
  2. No synchronization between these two operations
  3. Time-of-check to time-of-use (TOCTOU) issue:
    - Step T4-T5: We query the number of ranges and allocate the buffer
    - Step T6-T9: Memory hotplug adds new ranges between query and
      population
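
The timeline above can be modelled in user space. The following is only a
sketch under stated assumptions: struct range_sketch, struct cmem_sketch,
alloc_cmem_sketch(), populate_checked() and simulate_race() are hypothetical
names invented for illustration and are not kernel APIs. It counts the
ranges, allocates for that count, lets the source grow (steps T6-T8), and
then populates with a capacity check, so the overflow is reported as -ENOMEM
instead of being written past the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for one memory range (not the kernel's struct). */
struct range_sketch {
	unsigned long long start;
	unsigned long long end;
};

/* Hypothetical stand-in for struct crash_mem: capacity is fixed at allocation. */
struct cmem_sketch {
	unsigned int max_nr_ranges;
	unsigned int nr_ranges;
	struct range_sketch ranges[];
};

/* Allocate room for max_nr ranges, mirroring step T5. */
static struct cmem_sketch *alloc_cmem_sketch(unsigned int max_nr)
{
	struct cmem_sketch *cmem;

	cmem = calloc(1, sizeof(*cmem) + max_nr * sizeof(cmem->ranges[0]));
	if (cmem)
		cmem->max_nr_ranges = max_nr;
	return cmem;
}

/*
 * Copy src[0..nr_src) into cmem, refusing to write past the capacity that
 * was computed earlier. Returns 0 on success or -ENOMEM on overflow.
 */
static int populate_checked(struct cmem_sketch *cmem,
			    const struct range_sketch *src,
			    unsigned int nr_src)
{
	unsigned int i;

	assert(cmem && src);
	cmem->nr_ranges = 0;
	for (i = 0; i < nr_src; i++) {
		if (cmem->nr_ranges >= cmem->max_nr_ranges)
			return -ENOMEM;	/* source grew after it was counted */
		cmem->ranges[cmem->nr_ranges++] = src[i];
	}
	return 0;
}

/*
 * Simulate the timeline: count nr_before ranges, allocate, then populate
 * from a source that holds nr_after ranges by the time it is walked.
 */
static int simulate_race(unsigned int nr_before, unsigned int nr_after)
{
	struct range_sketch *src;
	struct cmem_sketch *cmem;
	unsigned int i;
	int ret;

	src = calloc(nr_after ? nr_after : 1, sizeof(*src));
	cmem = alloc_cmem_sketch(nr_before);
	if (!src || !cmem) {
		ret = -ENOMEM;
		goto out;
	}
	for (i = 0; i < nr_after; i++) {
		src[i].start = (unsigned long long)i << 20;
		src[i].end = src[i].start + 0xfffff;
	}
	ret = populate_checked(cmem, src, nr_after);
out:
	free(cmem);
	free(src);
	return ret;
}
```

Calling simulate_race(100, 102) models the 100 -> 102 case from the timeline.
A capacity check like this only turns the overflow into an error; it does not
close the window between counting and populating.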



Any comments or suggestions on the following approach?


int crash_prepare_headers(...)
{
	unsigned int max_nr_ranges;
	struct crash_mem *cmem;
	int ret;

	lock_device_hotplug();

	max_nr_ranges = arch_get_system_nr_ranges();
	// ...
	ret = arch_crash_populate_cmem(cmem);
	// ...

	unlock_device_hotplug();
	return ret;
}
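
As a rough user-space illustration of the same idea (a sketch, not kernel
code): a pthread mutex stands in for device_hotplug_lock, and the names
hotplug_lock_sketch, hotplug_add_sketch() and snapshot_ranges_sketch() are
hypothetical. The point is only that the count, the allocation and the copy
all happen while the lock that the "hotplug" path also takes is held, so the
number of ranges cannot change in between.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_SRC_RANGES 128

/* Hypothetical stand-in for one memory range. */
struct range_sketch {
	unsigned long long start;
	unsigned long long end;
};

/* Hypothetical stand-ins for memblock and device_hotplug_lock. */
static struct range_sketch src_ranges[MAX_SRC_RANGES];
static unsigned int src_nr;
static pthread_mutex_t hotplug_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* "Hotplug" path: adds one range while holding the lock. */
static int hotplug_add_sketch(unsigned long long start, unsigned long long end)
{
	int ret = 0;

	pthread_mutex_lock(&hotplug_lock_sketch);
	if (src_nr >= MAX_SRC_RANGES) {
		ret = -ENOSPC;
	} else {
		src_ranges[src_nr].start = start;
		src_ranges[src_nr].end = end;
		src_nr++;
	}
	pthread_mutex_unlock(&hotplug_lock_sketch);
	return ret;
}

/*
 * "Prepare headers" path: count, allocate and copy under the same lock.
 * On success *out points to a buffer the caller must free(), and the
 * number of copied ranges is returned; on failure a negative errno is
 * returned and *out is left untouched.
 */
static int snapshot_ranges_sketch(struct range_sketch **out)
{
	struct range_sketch *buf;
	unsigned int i, nr;
	int ret;

	assert(out);
	pthread_mutex_lock(&hotplug_lock_sketch);

	nr = src_nr;				/* query the count */
	buf = calloc(nr ? nr : 1, sizeof(*buf));	/* allocate for that count */
	if (!buf) {
		ret = -ENOMEM;
		goto unlock;
	}
	for (i = 0; i < nr; i++)		/* populate; nr cannot have changed */
		buf[i] = src_ranges[i];
	*out = buf;
	ret = (int)nr;
unlock:
	pthread_mutex_unlock(&hotplug_lock_sketch);
	return ret;
}
```

Every exit path in the sketch goes through the unlock label. The real
function would need the same care on its error paths, which are elided in
the pseudocode above.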



Re: [PATCH v9 0/5] arm64/riscv: Add support for crashkernel CMA reservation

Posted by Sourabh Jain 1 week, 3 days ago


On 24/03/26 09:32, Jinjie Ruan wrote:
>
> On 2026/3/24 0:55, Andrew Morton wrote:
>> On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan <ruanjinjie@huawei.com> wrote:
>>
>>> The crash memory allocation, and the exclude of crashk_res, crashk_low_res
>>> and crashk_cma memory are almost identical across different architectures,
>>> This patch set handle them in crash core in a general way, which eliminate
>>> a lot of duplication code.
>>>
>>> And add support for crashkernel CMA reservation for arm64 and riscv.
>> Thanks.  AI review has completed and it asks questions:
>> 	https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie@huawei.com
> I believe it identified 4 valid issues:
>
> - The already discovered crashk_low_res not excluded bug in the existing
> RISC-V code.
>
> - An existing memory leak issue in the existing PowerPC code.

Yes, and the suggested approach to fix the issue looks good: basically,
replace the return with goto out.

diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index 898742a5205c..1426d2099bad 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -440,7 +440,7 @@ static void update_crash_elfcorehdr(struct kimage *image, struct memory_notify *
         ret = get_crash_memory_ranges(&cmem);
         if (ret) {
                 pr_err("Failed to get crash mem range\n");
-               return;
+               goto out;
         }

         /*

Are you planning to handle this in this patch series? Or do you want me 
to send a separate fix patch?
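
For readers unfamiliar with the pattern, here is a small user-space sketch
of why the early return leaks. It assumes the getter can fail after it has
already allocated the buffer and that the out label is where the buffer is
freed. All names (tracked_alloc(), get_ranges_sketch(), update_leaky(),
update_fixed()) are hypothetical and are not taken from the powerpc code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Counts outstanding buffers so that a leak is observable. */
static int live_allocs;

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/*
 * Hypothetical getter: allocates *buf and may then fail, leaving *buf
 * allocated for the caller to release.
 */
static int get_ranges_sketch(int **buf, int fail_late)
{
	assert(buf);
	*buf = tracked_alloc(16 * sizeof(int));
	if (!*buf)
		return -ENOMEM;
	if (fail_late)
		return -EINVAL;	/* the buffer stays allocated */
	return 0;
}

/* Early return on error: skips the cleanup and leaks the buffer. */
static int update_leaky(int fail_late)
{
	int *buf = NULL;
	int ret;

	ret = get_ranges_sketch(&buf, fail_late);
	if (ret)
		return ret;
	/* ... use buf ... */
	tracked_free(buf);
	return 0;
}

/* Single exit through "out": the buffer is released on every path. */
static int update_fixed(int fail_late)
{
	int *buf = NULL;
	int ret;

	ret = get_ranges_sketch(&buf, fail_late);
	if (ret)
		goto out;
	/* ... use buf ... */
out:
	tracked_free(buf);
	return ret;
}
```

With fail_late set, update_fixed() leaves live_allocs unchanged, while
update_leaky() leaves one buffer outstanding.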



Re: [PATCH v9 0/5] arm64/riscv: Add support for crashkernel CMA reservation

Posted by Jinjie Ruan 1 week, 3 days ago


On 2026/3/24 12:29, Sourabh Jain wrote:
> 
> 
> On 24/03/26 09:32, Jinjie Ruan wrote:
>>
>> On 2026/3/24 0:55, Andrew Morton wrote:
>>> On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan
>>> <ruanjinjie@huawei.com> wrote:
>>>
>>>> The crash memory allocation, and the exclude of crashk_res,
>>>> crashk_low_res
>>>> and crashk_cma memory are almost identical across different
>>>> architectures,
>>>> This patch set handle them in crash core in a general way, which
>>>> eliminate
>>>> a lot of duplication code.
>>>>
>>>> And add support for crashkernel CMA reservation for arm64 and riscv.
>>> Thanks.  AI review has completed and it asks questions:
>>>     https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie@huawei.com
>> I believe it identified 4 valid issues:
>>
>> - The already discovered crashk_low_res not excluded bug in the existing
>> RISC-V code.
>>
>> - An existing memory leak issue in the existing PowerPC code.
> 
> Yes and suggested approach to fix the issue looks good.
> Which is basically replace return with goto out.
> 
> diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
> index 898742a5205c..1426d2099bad 100644
> --- a/arch/powerpc/kexec/crash.c
> +++ b/arch/powerpc/kexec/crash.c
> @@ -440,7 +440,7 @@ static void update_crash_elfcorehdr(struct kimage
> *image, struct memory_notify *
>         ret = get_crash_memory_ranges(&cmem);
>         if (ret) {
>                 pr_err("Failed to get crash mem range\n");
> -               return;
> +               goto out;
>         }
> 
>         /*
> 
> Are you planning to handle this in this patch series? Or do you want me
> to send a separate fix patch?

Yes, will fix it in v10, thanks for the clarification.

Best regards,
Jinjie

> 
> 
>>
>> - The ordering issue of adding CMA ranges to "linux,usable-memory-range".
>>
>> - An existing concurrency issue. A Concurrent memory hotplug may occur
>> between reading memblock and attempting to fill cmem during kexec_load()
>> for almost all existing architectures，I'm not sure if this is a
>> practical issue in reality..

What are your thoughts on this concurrency issue?


Re: [PATCH v9 0/5] arm64/riscv: Add support for crashkernel CMA reservation

Posted by Askar Safin 1 week, 3 days ago

Please remove me from the CC list in future versions of this patchset.

-- 
Askar Safin