[RFC] memmap: introduce cmdline parameter "memmap=nn[KMG]$" without start addr

lizhe.67@bytedance.com posted 1 patch 1 year, 10 months ago
There is a newer version of this series
Posted by lizhe.67@bytedance.com 1 year, 10 months ago
From: Li Zhe <lizhe.67@bytedance.com>

In the current kernel we can use memmap=nn[KMG]$ss[KMG] to reserve an
area of memory and keep the kernel from using it. We have to specify
both its start address and its length. In our scenario, we need to
reserve or allocate large contiguous memory, like 256M, on machines
which have different memory specifications, right at boot, for a
userland process. This memory will not be returned to the system
before reboot. It is hard for us to reserve memory of the same length
on machines with different memory specifications, because we have to
determine the start address of the reserved memory for each type of
machine.

This patch introduces a cmdline parameter "memmap=nn[KMG]$" to make
this work easy. It is an extension of "memmap=nn[KMG]$ss[KMG]". We
don't need to give the start address. The kernel will reserve a
suitable area of memory, and we can find the area in /proc/iomem under
the name "Memmap Alloc". Note that the "$" is required in the cmdline
parameter, or it will be confused with memmap=nn[KMG]@ss[KMG].

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 .../admin-guide/kernel-parameters.txt         |  7 ++
 arch/x86/kernel/e820.c                        | 64 ++++++++++++++++++-
 2 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2522b11e593f..b88df1e61d48 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3022,6 +3022,13 @@
 			         memmap=64K$0x18690000
 			         or
 			         memmap=0x10000$0x18690000
+			[KNL, X86] If ss[KMG] is omitted (memmap=nn[KMG]$), the
+			kernel will reserve a suitable area of memory itself. The
+			area can be found in /proc/iomem as "Memmap Alloc".
+			Example: Exclude memory with size 0x10000
+					 memmap=64K$
+					 or
+					 memmap=0x10000$
 			Some bootloaders may need an escape character before '$',
 			like Grub2, otherwise '$' and the following number
 			will be eaten.
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index f267205f2d5a..241d41ec870f 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -942,8 +942,18 @@ static int __init parse_memmap_one(char *p)
 		start_at = memparse(p+1, &p);
 		e820__range_add(start_at, mem_size, E820_TYPE_ACPI);
 	} else if (*p == '$') {
-		start_at = memparse(p+1, &p);
-		e820__range_add(start_at, mem_size, E820_TYPE_RESERVED);
+		if (*(p+1) == '\0') {
+			/*
+			 * If we only want to reserve 'mem_size' bytes and don't
+			 * care where the area starts, '$' is followed directly
+			 * by '\0'.
+			 */
+			p++;
+		} else {
+			/* We determine the start and size of the reserved memory */
+			start_at = memparse(p+1, &p);
+			e820__range_add(start_at, mem_size, E820_TYPE_RESERVED);
+		}
 	} else if (*p == '!') {
 		start_at = memparse(p+1, &p);
 		e820__range_add(start_at, mem_size, E820_TYPE_PRAM);
@@ -972,6 +982,40 @@ static int __init parse_memmap_one(char *p)
 	return *p == '\0' ? 0 : -EINVAL;
 }
 
+static int __init setup_memmap_random(char *p)
+{
+	char *oldp;
+	struct resource *res;
+	u64 start_at, mem_size;
+
+	if (!p)
+		return -EINVAL;
+	oldp = p;
+	mem_size = memparse(p, &p);
+	if (p == oldp)
+		return -EINVAL;
+
+	if (*p == '$') {
+		if (*(p+1) != '\0')
+			return 0; /* explicit start: handled by parse_memmap_one() */
+		start_at = memblock_phys_alloc(mem_size, SMP_CACHE_BYTES);
+		if (start_at == 0)
+			return -ENOMEM;
+		res = memblock_alloc(sizeof(struct resource), SMP_CACHE_BYTES);
+		if (res == NULL) {
+			memblock_phys_free(start_at, mem_size);
+			return -ENOMEM;
+		}
+		res->start = start_at;
+		res->end = start_at + mem_size - 1;
+		res->name = "Memmap Alloc";
+		res->flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+		res->desc = IORES_DESC_RESERVED;
+		insert_resource(&iomem_resource, res);
+	}
+	return 0;
+}
+
 static int __init parse_memmap_opt(char *str)
 {
 	while (str) {
@@ -988,6 +1032,22 @@ static int __init parse_memmap_opt(char *str)
 }
 early_param("memmap", parse_memmap_opt);
 
+static int __init setup_memmap_opt(char *str)
+{
+	while (str) {
+		char *k = strchr(str, ',');
+
+		if (k)
+			*k++ = 0;
+
+		setup_memmap_random(str);
+		str = k;
+	}
+
+	return 0;
+}
+__setup("memmap=", setup_memmap_opt);
+
 /*
  * Reserve all entries from the bootloader's extensible data nodes list,
  * because if present we are going to use it later on to fetch e820
-- 
2.20.1
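
For readers who want to try the parsing rule outside the kernel, here is a
minimal userspace sketch of how one memmap= token could be classified into
the existing "nn[KMG]$ss[KMG]" form and the proposed "nn[KMG]$" form. It is
not the kernel code: parse_size() is only a simplified stand-in for
memparse(), and enum memmap_kind and classify_memmap() are hypothetical
names used for illustration. Unlike parse_memmap_one(), the sketch also
rejects a start address with no digits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical classification of one memmap= token (not a kernel type). */
enum memmap_kind {
	MEMMAP_INVALID,		/* no size given, or trailing garbage */
	MEMMAP_RESERVE_AT,	/* nn[KMG]$ss[KMG]: caller chose the start */
	MEMMAP_RESERVE_ANY,	/* nn[KMG]$: kernel would choose the start */
	MEMMAP_OTHER		/* '@', '#', '!' or size-only forms */
};

/* Simplified stand-in for memparse(): a number plus optional K/M/G. */
static uint64_t parse_size(const char *s, const char **end)
{
	char *e;
	uint64_t v = strtoull(s, &e, 0);

	switch (*e) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		e++;
		break;
	default:
		break;
	}
	*end = e;
	return v;
}

/* Classify one token such as "64K$" or "0x10000$0x18690000". */
static enum memmap_kind classify_memmap(const char *tok, uint64_t *size,
					uint64_t *start)
{
	const char *p, *q;

	assert(tok && size && start);
	*start = 0;
	*size = parse_size(tok, &p);
	if (p == tok)
		return MEMMAP_INVALID;		/* no size was parsed */
	if (*p != '$')
		return MEMMAP_OTHER;		/* not a "reserve" token */
	if (p[1] == '\0')
		return MEMMAP_RESERVE_ANY;	/* proposed size-only form */

	*start = parse_size(p + 1, &q);
	if (q == p + 1 || *q != '\0')
		return MEMMAP_INVALID;
	return MEMMAP_RESERVE_AT;
}
```

With this sketch, "64K$" and "0x10000$" both classify as MEMMAP_RESERVE_ANY
with a size of 0x10000, while "0x10000$0x18690000" classifies as
MEMMAP_RESERVE_AT.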
Re: [RFC] memmap: introduce cmdline parameter "memmap=nn[KMG]$" without start addr
Posted by Dave Hansen 1 year, 10 months ago
On 6/22/22 23:24, lizhe.67@bytedance.com wrote:
> In our scenario, we need to reserve or allocate large contiguous
> memory, like 256M, on machines which have different memory
> specifications, right at boot, for a userland process.

Just marking the memory reserved doesn't do any good by itself.  There
must be some *other* kernel code to find this reserved area and make it
available to userspace.

It seems kinda silly to add this to the kernel without also adding the
other half of the solution.  Plus, we don't really even know what this
is for.  Are there other, better solutions?  I certainly can't offer any
because this changelog did not provide a full picture of the problem
this solves.
Re: [RFC] memmap: introduce cmdline parameter "memmap=nn[KMG]$" without start addr
Posted by lizhe.67@bytedance.com 1 year, 9 months ago
 On Thu, 23 Jun 2022 07:06:36, dave.hansen@intel.com wrote:
>> In our scenario, we need to reserve or allocate large contiguous
>> memory, like 256M, on machines which have different memory
>> specifications, right at boot, for a userland process.
>
>Just marking the memory reserved doesn't do any good by itself.  There
>must be some *other* kernel code to find this reserved area and make it
>available to userspace.
>

Sorry for not describing this clearly. We want to use /dev/mem as our
interface to access the memory from userspace, so we did not add any
other kernel code.
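
As a rough illustration of that userspace side, the sketch below looks up
the reserved range in /proc/iomem-style text and maps it through /dev/mem.
It is only a sketch under assumptions: the resource name "Memmap Alloc" is
the one this RFC proposes, the helper names parse_iomem_line(),
find_iomem_range() and map_phys_range() are made up for this example, and
whether the kernel permits such a /dev/mem mapping at all depends on its
configuration (for example CONFIG_STRICT_DEVMEM). Reading real addresses
from /proc/iomem also normally requires root.

```c
#include <assert.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Parse one /proc/iomem line of the form "  start-end : name".
 * Returns 1 and fills start/end if the resource name matches, else 0.
 */
static int parse_iomem_line(const char *line, const char *name,
			    uint64_t *start, uint64_t *end)
{
	char label[128];
	uint64_t s, e;

	assert(line && name && start && end);
	if (sscanf(line, " %" SCNx64 "-%" SCNx64 " : %127[^\n]",
		   &s, &e, label) != 3)
		return 0;
	if (strcmp(label, name) != 0)
		return 0;
	*start = s;
	*end = e;
	return 1;
}

/* Scan an already opened /proc/iomem-style stream for the first match. */
static int find_iomem_range(FILE *f, const char *name,
			    uint64_t *start, uint64_t *end)
{
	char line[256];

	while (fgets(line, sizeof(line), f))
		if (parse_iomem_line(line, name, start, end))
			return 1;
	return 0;
}

/*
 * Map the physical range [start, end] via /dev/mem.
 * Returns NULL on failure; the kernel may refuse this mapping.
 */
static void *map_phys_range(uint64_t start, uint64_t end)
{
	size_t len = (size_t)(end - start + 1);
	int fd = open("/dev/mem", O_RDWR | O_SYNC);
	void *p;

	if (fd < 0)
		return NULL;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
		 (off_t)start);
	close(fd);	/* the mapping stays valid after close() */
	return p == MAP_FAILED ? NULL : p;
}
```

A caller would fopen("/proc/iomem", "r"), pass the stream to
find_iomem_range() with the name "Memmap Alloc", and hand the resulting
start and end to map_phys_range().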

>It seems kinda silly to add this to the kernel without also adding the
>other half of the solution.  Plus, we don't really even know what this
>is for.  Are there other, better solutions?  I certainly can't offer any
>because this changelog did not provide a full picture of the problem
>this solves.

Again, sorry for not describing this clearly. Here is our scenario. We
need to reserve a large contiguous memory area, at least 256M on a
machine with 512G of memory, and more on larger machines. A userspace
program will use it through /dev/mem to store some data, and a piece of
hardware needs the data stored by the userspace program to do its job.
We need physically contiguous memory because our hardware can only
access memory without going through an MMU. So allocating a large
contiguous memory area for userspace is the best way for us. Since we
have several types of machines with different memory specifications, we
want an easy way to reserve memory with only a size parameter.

I have found a better way to meet this requirement. I will send a v2
patch soon.
Re: [RFC] memmap: introduce cmdline parameter "memmap=nn[KMG]$" without start addr
Posted by H. Peter Anvin 1 year, 10 months ago
On June 23, 2022 7:06:36 AM PDT, Dave Hansen <dave.hansen@intel.com> wrote:
>On 6/22/22 23:24, lizhe.67@bytedance.com wrote:
>> In our scenario, we need to reserve or allocate large contiguous
>> memory, like 256M, on machines which have different memory
>> specifications, right at boot, for a userland process.
>
>Just marking the memory reserved doesn't do any good by itself.  There
>must be some *other* kernel code to find this reserved area and make it
>available to userspace.
>
>It seems kinda silly to add this to the kernel without also adding the
>other half of the solution.  Plus, we don't really even know what this
>is for.  Are there other, better solutions?  I certainly can't offer any
>because this changelog did not provide a full picture of the problem
>this solves.

Don't we already have a large contiguous physical memory allocator for this reason (misdesigned hardware?)
Re: [RFC] memmap: introduce cmdline parameter "memmap=nn[KMG]$" without start addr
Posted by lizhe.67@bytedance.com 1 year, 9 months ago
On Thu, 23 Jun 2022 11:22:52, H. Peter Anvin <hpa@zytor.com> wrote:
>>On 6/22/22 23:24, lizhe.67@bytedance.com wrote:
>>> In our scenario, we need to reserve or allocate large contiguous
>>> memory, like 256M, on machines which have different memory
>>> specifications, right at boot, for a userland process.
>>
>>Just marking the memory reserved doesn't do any good by itself.  There
>>must be some *other* kernel code to find this reserved area and make it
>>available to userspace.
>>
>>It seems kinda silly to add this to the kernel without also adding the
>>other half of the solution.  Plus, we don't really even know what this
>>is for.  Are there other, better solutions?  I certainly can't offer any
>>because this changelog did not provide a full picture of the problem
>>this solves.
>
>Don't we already have a large contiguous physical memory allocator for this reason (misdesigned hardware?)

Yes, we have already considered using CMA to meet the requirement. But CMA
only provides kernel-space interfaces for memory allocation. It seems that
userspace has no way to access that memory in the current kernel. In our
scenario, we need to reserve large contiguous physical memory for a
userspace program. It stores some data into the memory and a piece of
hardware consumes it. So allocating a large contiguous memory area for the
userspace program is the best way for us.