[PATCH -next] crash: fix x86_32 memory reserve dead loop retry bug at "high"

Jinjie Ruan posted 1 patch 1 year, 5 months ago
kernel/crash_reserve.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH -next] crash: fix x86_32 memory reserve dead loop retry bug at "high"
Posted by Jinjie Ruan 1 year, 5 months ago
Similar to commit 8f9dade5906a ("crash: fix x86_32 memory reserve dead loop
retry bug") and in the symmetry case, on x86_32 Qemu machine with
1GB memory, the cmdline "crashkernel=512M" will also cause system stall
as below:

	ACPI: Reserving FACP table memory at [mem 0x3ffe18b8-0x3ffe192b]
	ACPI: Reserving DSDT table memory at [mem 0x3ffe0040-0x3ffe18b7]
	ACPI: Reserving FACS table memory at [mem 0x3ffe0000-0x3ffe003f]
	ACPI: Reserving APIC table memory at [mem 0x3ffe192c-0x3ffe19bb]
	ACPI: Reserving HPET table memory at [mem 0x3ffe19bc-0x3ffe19f3]
	ACPI: Reserving WAET table memory at [mem 0x3ffe19f4-0x3ffe1a1b]
	143MB HIGHMEM available.
	879MB LOWMEM available.
	  mapped low ram: 0 - 36ffe000
	  low ram: 0 - 36ffe000
	  (stall here)

The reason is that the CRASH_ADDR_LOW_MAX is equal to CRASH_ADDR_HIGH_MAX
on x86_32, the first "low" crash kernel memory reservation for 512M fails,
then it go into the "retry" loop and never came out as below (consider
CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX = 512M):

-> reserve_crashkernel_generic() and high is false
   -> alloc at [0, 0x20000000] fail
      -> alloc at [0x20000000, 0x20000000] fail and repeatedly
      (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX).

Fix it by also changing the another out check condition, the fixed base
situation has no problem because it warn out if it fail to alloc.

After this patch, it prints:
	cannot allocate crashkernel (size:0x20000000)

Fixes: 9c08a2a139fe ("x86: kdump: use generic interface to simplify crashkernel reservation code")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 kernel/crash_reserve.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
index 03e455738e75..36c13cf942f4 100644
--- a/kernel/crash_reserve.c
+++ b/kernel/crash_reserve.c
@@ -409,7 +409,7 @@ void __init reserve_crashkernel_generic(char *cmdline,
 		 * low memory, fall back to high memory, the minimum required
 		 * low memory will be reserved later.
 		 */
-		if (!high && search_end == CRASH_ADDR_LOW_MAX) {
+		if (!high && !search_base) {
 			search_end = CRASH_ADDR_HIGH_MAX;
 			search_base = CRASH_ADDR_LOW_MAX;
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
-- 
2.34.1
Re: [PATCH -next] crash: fix x86_32 memory reserve dead loop retry bug at "high"
Posted by Baoquan He 1 year, 5 months ago
On 07/17/24 at 03:09pm, Jinjie Ruan wrote:
> Similar to commit 8f9dade5906a ("crash: fix x86_32 memory reserve dead loop
> retry bug") and in the symmetry case, on x86_32 Qemu machine with
> 1GB memory, the cmdline "crashkernel=512M" will also cause system stall
> as below:
> 
> 	ACPI: Reserving FACP table memory at [mem 0x3ffe18b8-0x3ffe192b]
> 	ACPI: Reserving DSDT table memory at [mem 0x3ffe0040-0x3ffe18b7]
> 	ACPI: Reserving FACS table memory at [mem 0x3ffe0000-0x3ffe003f]
> 	ACPI: Reserving APIC table memory at [mem 0x3ffe192c-0x3ffe19bb]
> 	ACPI: Reserving HPET table memory at [mem 0x3ffe19bc-0x3ffe19f3]
> 	ACPI: Reserving WAET table memory at [mem 0x3ffe19f4-0x3ffe1a1b]
> 	143MB HIGHMEM available.
> 	879MB LOWMEM available.
> 	  mapped low ram: 0 - 36ffe000
> 	  low ram: 0 - 36ffe000
> 	  (stall here)
> 
> The reason is that the CRASH_ADDR_LOW_MAX is equal to CRASH_ADDR_HIGH_MAX
> on x86_32, the first "low" crash kernel memory reservation for 512M fails,
> then it go into the "retry" loop and never came out as below (consider
> CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX = 512M):
> 
> -> reserve_crashkernel_generic() and high is false
>    -> alloc at [0, 0x20000000] fail
>       -> alloc at [0x20000000, 0x20000000] fail and repeatedly
>       (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX).
> 
> Fix it by also changing the another out check condition, the fixed base
> situation has no problem because it warn out if it fail to alloc.
> 
> After this patch, it prints:
> 	cannot allocate crashkernel (size:0x20000000)
> 
> Fixes: 9c08a2a139fe ("x86: kdump: use generic interface to simplify crashkernel reservation code")
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
>  kernel/crash_reserve.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
> index 03e455738e75..36c13cf942f4 100644
> --- a/kernel/crash_reserve.c
> +++ b/kernel/crash_reserve.c
> @@ -409,7 +409,7 @@ void __init reserve_crashkernel_generic(char *cmdline,
>  		 * low memory, fall back to high memory, the minimum required
>  		 * low memory will be reserved later.
>  		 */
> -		if (!high && search_end == CRASH_ADDR_LOW_MAX) {
> +		if (!high && !search_base) {

Hmm, this may not be good. We can't guarantee that CRASH_ADDR_LOW_MAX must
not be 0. I still suggest you testing below draft patch to see if it works
well. And we should revert the patch in Andrew's tree since it's not good.
Posting like these mess will confuse people and add difficulty when
backporting.

You haven't responded to my earlier request to test those two draft
patches. When you tested below code and it's good, you can post this as
a formal patch. So my suggestion to the whole work is:

1) revert commit 8f9dade5906a in Andrew's tree;

2) post two patches I suggested to prevert crashkernel=,high for 32bit
system, and fix the issue you found;

3) post patchset to make arm32 use generic crashkernel reservation.

diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
index 5b2722a93a48..ac087ba442cd 100644
--- a/kernel/crash_reserve.c
+++ b/kernel/crash_reserve.c
@@ -414,7 +414,8 @@ void __init reserve_crashkernel_generic(char *cmdline,
 			search_end = CRASH_ADDR_HIGH_MAX;
 			search_base = CRASH_ADDR_LOW_MAX;
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
-			goto retry;
+			if (search_base != search_end)
+				goto retry;
 		}
 
 		/*
Re: [PATCH -next] crash: fix x86_32 memory reserve dead loop retry bug at "high"
Posted by Jinjie Ruan 1 year, 5 months ago

On 2024/7/17 21:38, Baoquan He wrote:
> On 07/17/24 at 03:09pm, Jinjie Ruan wrote:
>> Similar to commit 8f9dade5906a ("crash: fix x86_32 memory reserve dead loop
>> retry bug") and in the symmetry case, on x86_32 Qemu machine with
>> 1GB memory, the cmdline "crashkernel=512M" will also cause system stall
>> as below:
>>
>> 	ACPI: Reserving FACP table memory at [mem 0x3ffe18b8-0x3ffe192b]
>> 	ACPI: Reserving DSDT table memory at [mem 0x3ffe0040-0x3ffe18b7]
>> 	ACPI: Reserving FACS table memory at [mem 0x3ffe0000-0x3ffe003f]
>> 	ACPI: Reserving APIC table memory at [mem 0x3ffe192c-0x3ffe19bb]
>> 	ACPI: Reserving HPET table memory at [mem 0x3ffe19bc-0x3ffe19f3]
>> 	ACPI: Reserving WAET table memory at [mem 0x3ffe19f4-0x3ffe1a1b]
>> 	143MB HIGHMEM available.
>> 	879MB LOWMEM available.
>> 	  mapped low ram: 0 - 36ffe000
>> 	  low ram: 0 - 36ffe000
>> 	  (stall here)
>>
>> The reason is that the CRASH_ADDR_LOW_MAX is equal to CRASH_ADDR_HIGH_MAX
>> on x86_32, the first "low" crash kernel memory reservation for 512M fails,
>> then it go into the "retry" loop and never came out as below (consider
>> CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX = 512M):
>>
>> -> reserve_crashkernel_generic() and high is false
>>    -> alloc at [0, 0x20000000] fail
>>       -> alloc at [0x20000000, 0x20000000] fail and repeatedly
>>       (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX).
>>
>> Fix it by also changing the another out check condition, the fixed base
>> situation has no problem because it warn out if it fail to alloc.
>>
>> After this patch, it prints:
>> 	cannot allocate crashkernel (size:0x20000000)
>>
>> Fixes: 9c08a2a139fe ("x86: kdump: use generic interface to simplify crashkernel reservation code")
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>>  kernel/crash_reserve.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
>> index 03e455738e75..36c13cf942f4 100644
>> --- a/kernel/crash_reserve.c
>> +++ b/kernel/crash_reserve.c
>> @@ -409,7 +409,7 @@ void __init reserve_crashkernel_generic(char *cmdline,
>>  		 * low memory, fall back to high memory, the minimum required
>>  		 * low memory will be reserved later.
>>  		 */
>> -		if (!high && search_end == CRASH_ADDR_LOW_MAX) {
>> +		if (!high && !search_base) {
> 
> Hmm, this may not be good. We can't guarantee that CRASH_ADDR_LOW_MAX must
> not be 0. I still suggest you testing below draft patch to see if it works
> well. And we should revert the patch in Andrew's tree since it's not good.
> Posting like these mess will confuse people and add difficulty when
> backporting.

OK,let me get this straight and test your draft patches, if it is ok,
I'll send them sooner.

> 
> You haven't responded to my earlier request to test those two draft
> patches. When you tested below code and it's good, you can post this as
> a formal patch. So my suggestion to the whole work is:
> 
> 1) revert commit 8f9dade5906a in Andrew's tree;
> 
> 2) post two patches I suggested to prevert crashkernel=,high for 32bit
> system, and fix the issue you found;
> 
> 3) post patchset to make arm32 use generic crashkernel reservation.

Sorry, I didn't quite understand your real intentions before. Thanks for
the suggestion. I'll retest and submit the patch based on your suggestion.

> 
> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
> index 5b2722a93a48..ac087ba442cd 100644
> --- a/kernel/crash_reserve.c
> +++ b/kernel/crash_reserve.c
> @@ -414,7 +414,8 @@ void __init reserve_crashkernel_generic(char *cmdline,
>  			search_end = CRASH_ADDR_HIGH_MAX;
>  			search_base = CRASH_ADDR_LOW_MAX;
>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> -			goto retry;
> +			if (search_base != search_end)
> +				goto retry;
>  		}
>  
>  		/*
> 
> 
Re: [PATCH -next] crash: fix x86_32 memory reserve dead loop retry bug at "high"
Posted by Andrew Morton 1 year, 5 months ago
On Wed, 17 Jul 2024 21:38:41 +0800 Baoquan He <bhe@redhat.com> wrote:

> 1) revert commit 8f9dade5906a in Andrew's tree;

Thanks, dropped.
Re: [PATCH -next] crash: fix x86_32 memory reserve dead loop retry bug at "high"
Posted by Baoquan He 1 year, 5 months ago
On 07/17/24 at 12:49pm, Andrew Morton wrote:
> On Wed, 17 Jul 2024 21:38:41 +0800 Baoquan He <bhe@redhat.com> wrote:
> 
> > 1) revert commit 8f9dade5906a in Andrew's tree;

Thanks, Andrew.