[PATCH -next v5 1/2] riscv: kdump: Implement crashkernel=X,[high,low]

Chen Jiahao posted 2 patches 2 years, 9 months ago
There is a newer version of this series
[PATCH -next v5 1/2] riscv: kdump: Implement crashkernel=X,[high,low]
Posted by Chen Jiahao 2 years, 9 months ago
On riscv, the current crash kernel allocation logic first tries to
allocate within the 32-bit addressable memory region and, if that
fails, retries without the 4G restriction.

To save DMA zone memory when allocating a relatively large crash
kernel region, allocating the reserved memory top down in high memory,
without overlapping the DMA zone, is a mature solution. Introduce the
parameter option crashkernel=X,[high,low] for this purpose.

One can reserve the crash kernel from high memory above the DMA zone
range by explicitly passing "crashkernel=X,high", or reserve a memory
range below 4G with "crashkernel=X,low".

Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Acked-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 arch/riscv/kernel/setup.c |  5 +++
 arch/riscv/mm/init.c      | 73 +++++++++++++++++++++++++++++++++++----
 2 files changed, 71 insertions(+), 7 deletions(-)

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 36b026057503..e0b7c1651d60 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -176,6 +176,11 @@ static void __init init_resources(void)
 		if (ret < 0)
 			goto error;
 	}
+	if (crashk_low_res.start != crashk_low_res.end) {
+		ret = add_resource(&iomem_resource, &crashk_low_res);
+		if (ret < 0)
+			goto error;
+	}
 #endif
 
 #ifdef CONFIG_CRASH_DUMP
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 747e5b1ef02d..910d2e8c3c77 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1272,6 +1272,28 @@ static inline void setup_vm_final(void)
 }
 #endif /* CONFIG_MMU */
 
+/* Reserve 128M low memory by default for swiotlb buffer */
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE	(128UL << 20)
+
+static int __init reserve_crashkernel_low(unsigned long long low_size)
+{
+	unsigned long long low_base;
+
+	low_base = memblock_phys_alloc_range(low_size, PMD_SIZE, 0, dma32_phys_limit);
+	if (!low_base) {
+		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
+		return -ENOMEM;
+	}
+
+	pr_info("crashkernel low memory reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
+		low_base, low_base + low_size, low_size >> 20);
+
+	crashk_low_res.start = low_base;
+	crashk_low_res.end = low_base + low_size - 1;
+
+	return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -1283,8 +1305,11 @@ static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base = 0;
 	unsigned long long crash_size = 0;
+	unsigned long long crash_low_size = 0;
 	unsigned long search_start = memblock_start_of_DRAM();
-	unsigned long search_end = memblock_end_of_DRAM();
+	unsigned long search_end = (unsigned long)dma32_phys_limit;
+	char *cmdline = boot_command_line;
+	bool fixed_base = false;
 
 	int ret = 0;
 
@@ -1300,14 +1325,34 @@ static void __init reserve_crashkernel(void)
 		return;
 	}
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
-	if (ret || !crash_size)
+	if (ret == -ENOENT) {
+		/* Fallback to crashkernel=X,[high,low] */
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+
+		/*
+		 * crashkernel=Y,low is valid only when crashkernel=X,high
+		 * is passed.
+		 */
+		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
+		if (ret == -ENOENT)
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+		else if (ret)
+			return;
+
+		search_end = memblock_end_of_DRAM();
+	} else if (ret || !crash_size) {
+		/* Invalid argument value specified */
 		return;
+	}
 
 	crash_size = PAGE_ALIGN(crash_size);
 
 	if (crash_base) {
+		fixed_base = true;
 		search_start = crash_base;
 		search_end = crash_base + crash_size;
 	}
@@ -1320,17 +1365,31 @@ static void __init reserve_crashkernel(void)
 	 * swiotlb can work on the crash kernel.
 	 */
 	crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
-					       search_start,
-					       min(search_end, (unsigned long) SZ_4G));
+					       search_start, search_end);
 	if (crash_base == 0) {
-		/* Try again without restricting region to 32bit addressible memory */
+		if (fixed_base) {
+			pr_warn("crashkernel: allocating failed with given size@offset\n");
+			return;
+		}
+		search_end = memblock_end_of_DRAM();
+
+		/* Try again above the region of 32bit addressible memory */
 		crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
-						search_start, search_end);
+						       search_start, search_end);
 		if (crash_base == 0) {
 			pr_warn("crashkernel: couldn't allocate %lldKB\n",
 				crash_size >> 10);
 			return;
 		}
+
+		if (!crash_low_size)
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+	}
+
+	if ((crash_base > dma32_phys_limit - crash_low_size) &&
+	    crash_low_size && reserve_crashkernel_low(crash_low_size)) {
+		memblock_phys_free(crash_base, crash_size);
+		return;
 	}
 
 	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
-- 
2.31.1
Re: [PATCH -next v5 1/2] riscv: kdump: Implement crashkernel=X,[high,low]
Posted by Baoquan He 2 years, 8 months ago
Hi Jiahao,

On 05/11/23 at 04:51pm, Chen Jiahao wrote:
......  
> @@ -1300,14 +1325,34 @@ static void __init reserve_crashkernel(void)
>  		return;
>  	}
>  
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>  				&crash_size, &crash_base);
> -	if (ret || !crash_size)
> +	if (ret == -ENOENT) {
> +		/* Fallback to crashkernel=X,[high,low] */
> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> +		if (ret || !crash_size)
> +			return;
> +
> +		/*
> +		 * crashkernel=Y,low is valid only when crashkernel=X,high
> +		 * is passed.
> +		 */
> +		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
> +		if (ret == -ENOENT)
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +		else if (ret)
> +			return;
> +
> +		search_end = memblock_end_of_DRAM();
> +	} else if (ret || !crash_size) {
> +		/* Invalid argument value specified */
>  		return;
> +	}

The parsing part looks great, but you didn't record whether a high
reservation was specified. Please see my later comment for why that
is needed.

>  
>  	crash_size = PAGE_ALIGN(crash_size);
>  
>  	if (crash_base) {
> +		fixed_base = true;
>  		search_start = crash_base;
>  		search_end = crash_base + crash_size;
>  	}
> @@ -1320,17 +1365,31 @@ static void __init reserve_crashkernel(void)
>  	 * swiotlb can work on the crash kernel.
>  	 */
>  	crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
> -					       search_start,
> -					       min(search_end, (unsigned long) SZ_4G));
> +					       search_start, search_end);

If a high reservation is specified, you have
search_start = memblock_start_of_DRAM();
search_end = memblock_end_of_DRAM();

Then the first attempt here already searches top down across that
whole range.

>  	if (crash_base == 0) {
> -		/* Try again without restricting region to 32bit addressible memory */
> +		if (fixed_base) {
> +			pr_warn("crashkernel: allocating failed with given size@offset\n");
> +			return;
> +		}
> +		search_end = memblock_end_of_DRAM();
> +
> +		/* Try again above the region of 32bit addressible memory */
>  		crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
> -						search_start, search_end);
> +						       search_start, search_end);

In the crashkernel=,high case, if the first attempt fails, search_end
is assigned memblock_end_of_DRAM() again here. That is exactly the same
attempt, so why is it needed? Why not use a local variable 'high' to
mark crashkernel=,high, and then check it when deciding how to adjust
the reservation range?
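
To make the concern concrete, here is a minimal stand-alone model of
the v5 range selection. It is only a sketch: struct range,
v5_first_range(), v5_retry_range() and same_range() are hypothetical
helpers written for illustration, not kernel code, and the fixed-base
case is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a [start, end) search window. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * First attempt as v5 sets it up: search_start is the start of DRAM,
 * search_end is dma32_phys_limit unless crashkernel=X,high was parsed,
 * in which case it is the end of DRAM.
 */
static struct range v5_first_range(bool high, uint64_t dram_start,
				   uint64_t dram_end, uint64_t dma32_limit)
{
	struct range r;

	assert(dram_start < dram_end);
	r.start = dram_start;
	r.end = high ? dram_end : dma32_limit;
	return r;
}

/* Retry as v5 does it: only search_end is reset to the end of DRAM. */
static struct range v5_retry_range(struct range first, uint64_t dram_end)
{
	first.end = dram_end;
	return first;
}

static bool same_range(struct range a, struct range b)
{
	return a.start == b.start && a.end == b.end;
}
```

With high set, v5_retry_range() returns the window that
v5_first_range() already produced, so the second
memblock_phys_alloc_range() call repeats the search that just failed.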

Do I misunderstand the code? 

Thanks
Baoquan

>  		if (crash_base == 0) {
>  			pr_warn("crashkernel: couldn't allocate %lldKB\n",
>  				crash_size >> 10);
>  			return;
>  		}
> +
> +		if (!crash_low_size)
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +	}
> +
> +	if ((crash_base > dma32_phys_limit - crash_low_size) &&
> +	    crash_low_size && reserve_crashkernel_low(crash_low_size)) {
> +		memblock_phys_free(crash_base, crash_size);
> +		return;
>  	}
>  
>  	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
> -- 
> 2.31.1
>
Re: [PATCH -next v5 1/2] riscv: kdump: Implement crashkernel=X,[high,low]
Posted by chenjiahao (C) 2 years, 8 months ago
On 2023/6/4 11:50, Baoquan He wrote:
> Hi Jiahao,
>
> On 05/11/23 at 04:51pm, Chen Jiahao wrote:
> ......
>> @@ -1300,14 +1325,34 @@ static void __init reserve_crashkernel(void)
>>   		return;
>>   	}
>>   
>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>   				&crash_size, &crash_base);
>> -	if (ret || !crash_size)
>> +	if (ret == -ENOENT) {
>> +		/* Fallback to crashkernel=X,[high,low] */
>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>> +		if (ret || !crash_size)
>> +			return;
>> +
>> +		/*
>> +		 * crashkernel=Y,low is valid only when crashkernel=X,high
>> +		 * is passed.
>> +		 */
>> +		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
>> +		if (ret == -ENOENT)
>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +		else if (ret)
>> +			return;
>> +
>> +		search_end = memblock_end_of_DRAM();
>> +	} else if (ret || !crash_size) {
>> +		/* Invalid argument value specified */
>>   		return;
>> +	}
> The parsing part looks great, while you didn't mark if it's specified
> high reservation, please see later comment why it's needed.
>
>>   
>>   	crash_size = PAGE_ALIGN(crash_size);
>>   
>>   	if (crash_base) {
>> +		fixed_base = true;
>>   		search_start = crash_base;
>>   		search_end = crash_base + crash_size;
>>   	}
>> @@ -1320,17 +1365,31 @@ static void __init reserve_crashkernel(void)
>>   	 * swiotlb can work on the crash kernel.
>>   	 */
>>   	crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
>> -					       search_start,
>> -					       min(search_end, (unsigned long) SZ_4G));
>> +					       search_start, search_end);
> If it's a specified high reservation, you have
> search_start = memblock_start_of_DRAM();
> search_end = memblock_end_of_DRAM();
>
> Then it attempts to search top down first time here.
>
>>   	if (crash_base == 0) {
>> -		/* Try again without restricting region to 32bit addressible memory */
>> +		if (fixed_base) {
>> +			pr_warn("crashkernel: allocating failed with given size@offset\n");
>> +			return;
>> +		}
>> +		search_end = memblock_end_of_DRAM();
>> +
>> +		/* Try again above the region of 32bit addressible memory */
>>   		crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
>> -						search_start, search_end);
>> +						       search_start, search_end);
> If crashkernel=,high case, the first attempt failed, here it assigns
> search_end with memblock_end_of_DRAM(). It's the exactly the same
> attempt, why is that needed? Why don't you use a local variable 'high'
> to mark the crashkernel=,hig, then judge when deciding how to adjsut the
> reservation range.
>
> Do I misunderstand the code?
>
> Thanks
> Baoquan

You are right. Here I use search_end = memblock_end_of_DRAM() for the
first attempt in the "crashkernel=,high" case, but the retry does not
distinguish it from the other cases if the first attempt fails.

I have read your latest refactoring on Arm64. Introducing the "high"
flag is a good choice; the logic becomes more straightforward when
handling the crashkernel=,high case and the retry.

Following that logic, I will introduce a "high" flag and set it when
parsing the cmdline. When the first attempt fails:

if fixed_base:
     failed and return;

if set high:
     search_start = memblock_start_of_DRAM();
     search_end = (unsigned long)dma32_phys_limit;
else:
     search_start = (unsigned long)dma32_phys_limit;
     search_end = memblock_end_of_DRAM();

second attempt with new {search_start, search_end}
...

This should handle the "crashkernel=,high" case correctly and avoid a
reservation that crosses the 4G boundary.
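
The retry selection above could be sketched in C roughly as follows.
This is an illustration under assumptions, not the v6 code: struct
search_range and crash_retry_range() are hypothetical names, and the
DRAM bounds and DMA32 limit are passed in as plain integers instead of
being read from memblock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical [start, end) window handed to the second attempt. */
struct search_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Pick the window for the second allocation attempt.
 * Returns false when no retry is allowed (a fixed size@offset request).
 */
static bool crash_retry_range(bool fixed_base, bool high,
			      uint64_t dram_start, uint64_t dram_end,
			      uint64_t dma32_limit,
			      struct search_range *out)
{
	assert(out);

	if (fixed_base)
		return false;

	if (high) {
		/* High attempt failed: fall back to memory below the limit. */
		out->start = dram_start;
		out->end = dma32_limit;
	} else {
		/* Low attempt failed: fall back to memory above the limit. */
		out->start = dma32_limit;
		out->end = dram_end;
	}
	return true;
}
```

Because each retry window lies entirely on one side of
dma32_phys_limit, the second attempt cannot return a region that
straddles it.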

Is that logic correct, or have I missed any other problem?

Thanks,
Jiahao

>
>>   		if (crash_base == 0) {
>>   			pr_warn("crashkernel: couldn't allocate %lldKB\n",
>>   				crash_size >> 10);
>>   			return;
>>   		}
>> +
>> +		if (!crash_low_size)
>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +	}
>> +
>> +	if ((crash_base > dma32_phys_limit - crash_low_size) &&
>> +	    crash_low_size && reserve_crashkernel_low(crash_low_size)) {
>> +		memblock_phys_free(crash_base, crash_size);
>> +		return;
>>   	}
>>   
>>   	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
>> -- 
>> 2.31.1
>>
Re: [PATCH -next v5 1/2] riscv: kdump: Implement crashkernel=X,[high,low]
Posted by chenjiahao (C) 2 years, 7 months ago
On 2023/6/15 17:49, chenjiahao (C) wrote:
>
> On 2023/6/4 11:50, Baoquan He wrote:
>> Hi Jiahao,
>>
>> On 05/11/23 at 04:51pm, Chen Jiahao wrote:
>> ......
>>> @@ -1300,14 +1325,34 @@ static void __init reserve_crashkernel(void)
>>>           return;
>>>       }
>>>   -    ret = parse_crashkernel(boot_command_line, 
>>> memblock_phys_mem_size(),
>>> +    ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>>                   &crash_size, &crash_base);
>>> -    if (ret || !crash_size)
>>> +    if (ret == -ENOENT) {
>>> +        /* Fallback to crashkernel=X,[high,low] */
>>> +        ret = parse_crashkernel_high(cmdline, 0, &crash_size, 
>>> &crash_base);
>>> +        if (ret || !crash_size)
>>> +            return;
>>> +
>>> +        /*
>>> +         * crashkernel=Y,low is valid only when crashkernel=X,high
>>> +         * is passed.
>>> +         */
>>> +        ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, 
>>> &crash_base);
>>> +        if (ret == -ENOENT)
>>> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>> +        else if (ret)
>>> +            return;
>>> +
>>> +        search_end = memblock_end_of_DRAM();
>>> +    } else if (ret || !crash_size) {
>>> +        /* Invalid argument value specified */
>>>           return;
>>> +    }
>> The parsing part looks great, while you didn't mark if it's specified
>> high reservation, please see later comment why it's needed.
>>
>>>         crash_size = PAGE_ALIGN(crash_size);
>>>         if (crash_base) {
>>> +        fixed_base = true;
>>>           search_start = crash_base;
>>>           search_end = crash_base + crash_size;
>>>       }
>>> @@ -1320,17 +1365,31 @@ static void __init reserve_crashkernel(void)
>>>        * swiotlb can work on the crash kernel.
>>>        */
>>>       crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
>>> -                           search_start,
>>> -                           min(search_end, (unsigned long) SZ_4G));
>>> +                           search_start, search_end);
>> If it's a specified high reservation, you have
>> search_start = memblock_start_of_DRAM();
>> search_end = memblock_end_of_DRAM();
>>
>> Then it attempts to search top down first time here.
>>
>>>       if (crash_base == 0) {
>>> -        /* Try again without restricting region to 32bit 
>>> addressible memory */
>>> +        if (fixed_base) {
>>> +            pr_warn("crashkernel: allocating failed with given 
>>> size@offset\n");
>>> +            return;
>>> +        }
>>> +        search_end = memblock_end_of_DRAM();
>>> +
>>> +        /* Try again above the region of 32bit addressible memory */
>>>           crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
>>> -                        search_start, search_end);
>>> +                               search_start, search_end);
>> If crashkernel=,high case, the first attempt failed, here it assigns
>> search_end with memblock_end_of_DRAM(). It's the exactly the same
>> attempt, why is that needed? Why don't you use a local variable 'high'
>> to mark the crashkernel=,hig, then judge when deciding how to adjsut the
>> reservation range.
>>
>> Do I misunderstand the code?
>>
>> Thanks
>> Baoquan
>
> You are right. Here I use search_end = memblock_end_of_DRAM() for the
> first attempt on "crashkernel=,high" case, but it will not distinct from
> other cases if the first attempt fails.
>
> I have read your latest refactor on Arm64, introducing the "high" flag
> is a good choice, the logic gets more straightforward when handling
> crashkernel=,high case and retrying.
>
> Following that logic, here introducing and set "high" flag when parsing
> cmdline, when the first attempt failed:
>
> if fixed_base:
>     failed and return;
>
> if set high:
>     search_start = memblock_start_of_DRAM();
>     search_end = (unsigned long)dma32_phys_limit;
> else:
>     search_start = (unsigned long)dma32_phys_limit;
>     search_end = memblock_end_of_DRAM();
>
> second attempt with new {search_start, search_end}
> ...
>
> This should handle "crashkernel=,high" case correctly and avoid cross
> 4G reservation.
>
> Is that logic correct, or is any other problem missed?
>
> Thanks,
> Jiahao

I have sent v6 patches implementing the logic above. That fixes the
retry logic and should be aligned with the Arm64 code.

Please let me know if any problem remains.

Thanks,
Jiahao


>
>>
>>>           if (crash_base == 0) {
>>>               pr_warn("crashkernel: couldn't allocate %lldKB\n",
>>>                   crash_size >> 10);
>>>               return;
>>>           }
>>> +
>>> +        if (!crash_low_size)
>>> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>> +    }
>>> +
>>> +    if ((crash_base > dma32_phys_limit - crash_low_size) &&
>>> +        crash_low_size && reserve_crashkernel_low(crash_low_size)) {
>>> +        memblock_phys_free(crash_base, crash_size);
>>> +        return;
>>>       }
>>>         pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld 
>>> MB)\n",
>>> -- 
>>> 2.31.1
>>>
>
Re: [PATCH -next v5 1/2] riscv: kdump: Implement crashkernel=X,[high,low]
Posted by Baoquan He 2 years, 7 months ago
On 07/01/23 at 05:51pm, chenjiahao (C) wrote:
...... 
> I have sent v6 patches, implementing the logic above. That fixes the
> retrying
> 
> logic and should be aligned with Arm64 code.

Hmm, it has improved a lot, but there is still an issue that needs to
be fixed. You missed the case where crashkernel low is explicitly
specified as zero. Your v6 does not handle that well, which means it is
not completely aligned with the current arm64 code.

crashkernel=xM,high crashkernel=0M,low
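
As a rough illustration of the distinction, the sketch below separates
"crashkernel=,low not given" from "crashkernel=0M,low given". It is
modeled on the v5 parsing quoted earlier in this thread, not on v6 or
the arm64 code; resolve_crash_low_size() is a hypothetical helper, and
parse_ret stands in for the return value of parse_crashkernel_low().

```c
#include <assert.h>
#include <errno.h>

/* Same default as in the v5 patch: 128M of low memory for swiotlb. */
#define DEFAULT_CRASH_KERNEL_LOW_SIZE	(128ULL << 20)

/*
 * Decide the low reservation size from the parse result.
 * Returns 0 on success or the negative errno from parsing.
 */
static int resolve_crash_low_size(int parse_ret, unsigned long long parsed,
				  unsigned long long *low_size)
{
	assert(low_size);

	if (parse_ret == -ENOENT) {
		/* No crashkernel=,low on the command line: use the default. */
		*low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
		return 0;
	}
	if (parse_ret)
		return parse_ret;

	/* Explicit value, including 0, which means "no low reservation". */
	*low_size = parsed;
	return 0;
}
```

The point is that a later "if (!crash_low_size)" check alone cannot
tell these two cases apart; the parse result has to be remembered.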

> 
> 
> Please let me know if there is any problem remains.

Earlier, I posted the RFC patchset below to try to unify the
crashkernel=,high support on x86, arm64 and risc-v in generic code.
I wonder what you think about it. risc-v can be added with very few
changes to get the crashkernel=,high support.

[RFC PATCH 0/4] kdump: add generic functions to simplify crashkernel crashkernel in architecture

Of course, the crashkernel=,high support can be added independently in
advance. Later my patchset can unify them and remove the duplicated code
in risc-v. It's up to you and the risc-v maintainers/reviewers to choose
one. Anyway, I will comment on your v6 to point out the issue.

Thanks
Baoquan
Re: [PATCH -next v5 1/2] riscv: kdump: Implement crashkernel=X,[high,low]
Posted by chenjiahao (C) 2 years, 7 months ago
On 2023/7/2 12:06, Baoquan He wrote:
> On 07/01/23 at 05:51pm, chenjiahao (C) wrote:
> ......
>> I have sent v6 patches, implementing the logic above. That fixes the
>> retrying
>>
>> logic and should be aligned with Arm64 code.
> Hmm, it has improved much, while there's still issue which need be
> fixed. You missed the case that crsahkernel low is specified as zero
> explicitly. Obviously your v6 is not able to handle that well. Means
> your v6 is not aligned with the current arm64 code completely.
>
> crashkernel=xM,high crashkernel=0M,low
>
>>
>> Please let me know if there is any problem remains.
> Earlier, I posted below RFC patchset to try to unify the
> crashkernel=,high support on x86, arm64 and risc-v, the generic arch.
> Wondering what you think about it. risc-v can be added in with very few
> change to get the crahskernel=,high support.
>
> [RFC PATCH 0/4] kdump: add generic functions to simplify crashkernel crashkernel in architecture
>
> Surely, the crashkernel=,high support can be added independently in
> advance. Later my patchset can unify them and remove the duplicated code
> in risc-v. It's up to you and risc-v maintainers/reivewers to take one.
> Anyway, I will add comment to your v6 to point out the issue.

It would be great if the crashkernel parsing and reservation logic could
be unified across architectures; the code would be more straightforward
and easier to use. I will review your RFC patchset in more depth later.

Meanwhile, I will continue to update my patchset for risc-v, as I would
like to get this feature supported earlier. Once your unified solution
is applied, simply removing the duplicated part will be fine. Until
then, I will update my risc-v code and further align it with the Arm64
logic.

Thanks for your careful review. I will fix the issue above and send the
v7 patchset soon.

Thanks,
Jiahao

>
> Thanks
> Baoquan
>