[PATCH] hw: riscv: Allow large kernels to boot by moving the initrd further way in RAM

Alexandre Ghiti posted 1 patch 9 months, 3 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20240205070040.367541-1-alexghiti@rivosinc.com
Maintainers: Palmer Dabbelt <palmer@dabbelt.com>, Alistair Francis <alistair.francis@wdc.com>, Bin Meng <bin.meng@windriver.com>, Weiwei Li <liwei1518@gmail.com>, Daniel Henrique Barboza <dbarboza@ventanamicro.com>, Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
hw/riscv/boot.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
[PATCH] hw: riscv: Allow large kernels to boot by moving the initrd further way in RAM
Posted by Alexandre Ghiti 9 months, 3 weeks ago
Currently, the initrd is placed at 128MB, which overlaps with the kernel
when it is large (for example syzbot kernels are). From the kernel side,
there is no reason we could not push the initrd further away in memory
to accomodate large kernels, so move the initrd at 512MB when possible.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
---
 hw/riscv/boot.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
index 0ffca05189..9a367af2fa 100644
--- a/hw/riscv/boot.c
+++ b/hw/riscv/boot.c
@@ -188,13 +188,13 @@ static void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
      * kernel is uncompressed it will not clobber the initrd. However
      * on boards without much RAM we must ensure that we still leave
      * enough room for a decent sized initrd, and on boards with large
-     * amounts of RAM we must avoid the initrd being so far up in RAM
-     * that it is outside lowmem and inaccessible to the kernel.
-     * So for boards with less  than 256MB of RAM we put the initrd
-     * halfway into RAM, and for boards with 256MB of RAM or more we put
-     * the initrd at 128MB.
+     * amounts of RAM, we put the initrd at 512MB to allow large kernels
+     * to boot.
+     * So for boards with less than 1GB of RAM we put the initrd
+     * halfway into RAM, and for boards with 1GB of RAM or more we put
+     * the initrd at 512MB.
      */
-    start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
+    start = kernel_entry + MIN(mem_size / 2, 512 * MiB);
 
     size = load_ramdisk(filename, start, mem_size - start);
     if (size == -1) {
-- 
2.39.2
Re: [PATCH] hw: riscv: Allow large kernels to boot by moving the initrd further way in RAM
Posted by Daniel Henrique Barboza 9 months, 3 weeks ago

On 2/5/24 04:00, Alexandre Ghiti wrote:
> Currently, the initrd is placed at 128MB, which overlaps with the kernel
> when it is large (for example syzbot kernels are). From the kernel side,
> there is no reason we could not push the initrd further away in memory
> to accomodate large kernels, so move the initrd at 512MB when possible.

typo: accommodate

> 
> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> ---

Patch looks good - just tested with an Ubuntu guest and nothing bad happened.

But I wonder ... what if there's an even bigger kernel we have to deal with?
Move initrd even further away to fit it in?

Instead of making assumptions about where initrd starts, we could grab the kernel
size loaded in the board and use it as a reference. This would be done by storing
the return of load_elf_ram_sym/load_uimage_as/load_image_targphys_as from
riscv_load_kernel() an passing as argument to riscv_load_initrd().

initrd start would then be:

     start = kernel_entry + MIN(mem_size / 2, kernel_size);

However, I believe we would like to keep the existing 128Mb minimum initrd start,
even if the kernel is smaller than 128Mb, to avoid breaking existing configs that
might be making this assumption. initrd start would then become:


     start = kernel_entry + MIN(mem_size / 2, MAX(kernel_size, 128 * MiB));



Thanks,



Daniel


>   hw/riscv/boot.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 0ffca05189..9a367af2fa 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -188,13 +188,13 @@ static void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
>        * kernel is uncompressed it will not clobber the initrd. However
>        * on boards without much RAM we must ensure that we still leave
>        * enough room for a decent sized initrd, and on boards with large
> -     * amounts of RAM we must avoid the initrd being so far up in RAM
> -     * that it is outside lowmem and inaccessible to the kernel.
> -     * So for boards with less  than 256MB of RAM we put the initrd
> -     * halfway into RAM, and for boards with 256MB of RAM or more we put
> -     * the initrd at 128MB.
> +     * amounts of RAM, we put the initrd at 512MB to allow large kernels
> +     * to boot.
> +     * So for boards with less than 1GB of RAM we put the initrd
> +     * halfway into RAM, and for boards with 1GB of RAM or more we put
> +     * the initrd at 512MB.
>        */
> -    start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
> +    start = kernel_entry + MIN(mem_size / 2, 512 * MiB);
>   
>       size = load_ramdisk(filename, start, mem_size - start);
>       if (size == -1) {
Re: [PATCH] hw: riscv: Allow large kernels to boot by moving the initrd further way in RAM
Posted by Alexandre Ghiti 9 months, 3 weeks ago
Hi Daniel,

On Mon, Feb 5, 2024 at 1:17 PM Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
>
>
> On 2/5/24 04:00, Alexandre Ghiti wrote:
> > Currently, the initrd is placed at 128MB, which overlaps with the kernel
> > when it is large (for example syzbot kernels are). From the kernel side,
> > there is no reason we could not push the initrd further away in memory
> > to accomodate large kernels, so move the initrd at 512MB when possible.
>
> typo: accommodate
>
> >
> > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > ---
>
> Patch looks good - just tested with an Ubuntu guest and nothing bad happened.
>
> But I wonder ... what if there's an even bigger kernel we have to deal with?
> Move initrd even further away to fit it in?
>
> Instead of making assumptions about where initrd starts, we could grab the kernel
> size loaded in the board and use it as a reference. This would be done by storing
> the return of load_elf_ram_sym/load_uimage_as/load_image_targphys_as from
> riscv_load_kernel() an passing as argument to riscv_load_initrd().
>
> initrd start would then be:
>
>      start = kernel_entry + MIN(mem_size / 2, kernel_size);
>
> However, I believe we would like to keep the existing 128Mb minimum initrd start,
> even if the kernel is smaller than 128Mb, to avoid breaking existing configs that
> might be making this assumption. initrd start would then become:
>
>
>      start = kernel_entry + MIN(mem_size / 2, MAX(kernel_size, 128 * MiB));

Great, I agree with you, thanks for the pointers. I'll just align the
size on a 2MB boundary to make sure the kernel mapping (which in the
case of Linux uses PMD) does not overlap with the initrd.

I'll get back soon with a v2.

Thanks again,

Alex

>
>
>
> Thanks,
>
>
>
> Daniel
>
>
> >   hw/riscv/boot.c | 12 ++++++------
> >   1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> > index 0ffca05189..9a367af2fa 100644
> > --- a/hw/riscv/boot.c
> > +++ b/hw/riscv/boot.c
> > @@ -188,13 +188,13 @@ static void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
> >        * kernel is uncompressed it will not clobber the initrd. However
> >        * on boards without much RAM we must ensure that we still leave
> >        * enough room for a decent sized initrd, and on boards with large
> > -     * amounts of RAM we must avoid the initrd being so far up in RAM
> > -     * that it is outside lowmem and inaccessible to the kernel.
> > -     * So for boards with less  than 256MB of RAM we put the initrd
> > -     * halfway into RAM, and for boards with 256MB of RAM or more we put
> > -     * the initrd at 128MB.
> > +     * amounts of RAM, we put the initrd at 512MB to allow large kernels
> > +     * to boot.
> > +     * So for boards with less than 1GB of RAM we put the initrd
> > +     * halfway into RAM, and for boards with 1GB of RAM or more we put
> > +     * the initrd at 512MB.
> >        */
> > -    start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
> > +    start = kernel_entry + MIN(mem_size / 2, 512 * MiB);
> >
> >       size = load_ramdisk(filename, start, mem_size - start);
> >       if (size == -1) {
Re: [PATCH] hw: riscv: Allow large kernels to boot by moving the initrd further way in RAM
Posted by Alexandre Ghiti 9 months, 3 weeks ago
Hi Daniel,

On Mon, Feb 5, 2024 at 2:36 PM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
>
> Hi Daniel,
>
> On Mon, Feb 5, 2024 at 1:17 PM Daniel Henrique Barboza
> <dbarboza@ventanamicro.com> wrote:
> >
> >
> >
> > On 2/5/24 04:00, Alexandre Ghiti wrote:
> > > Currently, the initrd is placed at 128MB, which overlaps with the kernel
> > > when it is large (for example syzbot kernels are). From the kernel side,
> > > there is no reason we could not push the initrd further away in memory
> > > to accomodate large kernels, so move the initrd at 512MB when possible.
> >
> > typo: accommodate
> >
> > >
> > > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > > ---
> >
> > Patch looks good - just tested with an Ubuntu guest and nothing bad happened.
> >
> > But I wonder ... what if there's an even bigger kernel we have to deal with?
> > Move initrd even further away to fit it in?
> >
> > Instead of making assumptions about where initrd starts, we could grab the kernel
> > size loaded in the board and use it as a reference. This would be done by storing
> > the return of load_elf_ram_sym/load_uimage_as/load_image_targphys_as from
> > riscv_load_kernel() an passing as argument to riscv_load_initrd().

So this does not work because the size returned by
load_image_targphys_as() does not take into account the size of the
BSS sections (and I guess other NOBITS sections) and then we end up
loading the initrd there. I'm a bit surprised though because arm64
does just that https://elixir.bootlin.com/qemu/latest/source/hw/arm/boot.c#L1034.
I also tried using the highaddr parameter, but that gives the same
result. We could parse the Image header (a PE header) to get the
"Virtual Size" of the .data section, but I did not find any PE header
parser in qemu, so that would be overkill?

Any idea?

Thanks,

Alex

> >
> > initrd start would then be:
> >
> >      start = kernel_entry + MIN(mem_size / 2, kernel_size);
> >
> > However, I believe we would like to keep the existing 128Mb minimum initrd start,
> > even if the kernel is smaller than 128Mb, to avoid breaking existing configs that
> > might be making this assumption. initrd start would then become:
> >
> >
> >      start = kernel_entry + MIN(mem_size / 2, MAX(kernel_size, 128 * MiB));
>
> Great, I agree with you, thanks for the pointers. I'll just align the
> size on a 2MB boundary to make sure the kernel mapping (which in the
> case of Linux uses PMD) does not overlap with the initrd.
>
> I'll get back soon with a v2.
>
> Thanks again,
>
> Alex
>
> >
> >
> >
> > Thanks,
> >
> >
> >
> > Daniel
> >
> >
> > >   hw/riscv/boot.c | 12 ++++++------
> > >   1 file changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> > > index 0ffca05189..9a367af2fa 100644
> > > --- a/hw/riscv/boot.c
> > > +++ b/hw/riscv/boot.c
> > > @@ -188,13 +188,13 @@ static void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
> > >        * kernel is uncompressed it will not clobber the initrd. However
> > >        * on boards without much RAM we must ensure that we still leave
> > >        * enough room for a decent sized initrd, and on boards with large
> > > -     * amounts of RAM we must avoid the initrd being so far up in RAM
> > > -     * that it is outside lowmem and inaccessible to the kernel.
> > > -     * So for boards with less  than 256MB of RAM we put the initrd
> > > -     * halfway into RAM, and for boards with 256MB of RAM or more we put
> > > -     * the initrd at 128MB.
> > > +     * amounts of RAM, we put the initrd at 512MB to allow large kernels
> > > +     * to boot.
> > > +     * So for boards with less than 1GB of RAM we put the initrd
> > > +     * halfway into RAM, and for boards with 1GB of RAM or more we put
> > > +     * the initrd at 512MB.
> > >        */
> > > -    start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
> > > +    start = kernel_entry + MIN(mem_size / 2, 512 * MiB);
> > >
> > >       size = load_ramdisk(filename, start, mem_size - start);
> > >       if (size == -1) {
Re: [PATCH] hw: riscv: Allow large kernels to boot by moving the initrd further way in RAM
Posted by Daniel Henrique Barboza 9 months, 3 weeks ago

On 2/6/24 06:41, Alexandre Ghiti wrote:
> Hi Daniel,
> 
> On Mon, Feb 5, 2024 at 2:36 PM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
>>
>> Hi Daniel,
>>
>> On Mon, Feb 5, 2024 at 1:17 PM Daniel Henrique Barboza
>> <dbarboza@ventanamicro.com> wrote:
>>>
>>>
>>>
>>> On 2/5/24 04:00, Alexandre Ghiti wrote:
>>>> Currently, the initrd is placed at 128MB, which overlaps with the kernel
>>>> when it is large (for example syzbot kernels are). From the kernel side,
>>>> there is no reason we could not push the initrd further away in memory
>>>> to accomodate large kernels, so move the initrd at 512MB when possible.
>>>
>>> typo: accommodate
>>>
>>>>
>>>> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
>>>> ---
>>>
>>> Patch looks good - just tested with an Ubuntu guest and nothing bad happened.
>>>
>>> But I wonder ... what if there's an even bigger kernel we have to deal with?
>>> Move initrd even further away to fit it in?
>>>
>>> Instead of making assumptions about where initrd starts, we could grab the kernel
>>> size loaded in the board and use it as a reference. This would be done by storing
>>> the return of load_elf_ram_sym/load_uimage_as/load_image_targphys_as from
>>> riscv_load_kernel() an passing as argument to riscv_load_initrd().
> 
> So this does not work because the size returned by
> load_image_targphys_as() does not take into account the size of the
> BSS sections (and I guess other NOBITS sections) and then we end up
> loading the initrd there. I'm a bit surprised though because arm64
> does just that https://elixir.bootlin.com/qemu/latest/source/hw/arm/boot.c#L1034.
> I also tried using the highaddr parameter, but that gives the same
> result. We could parse the Image header (a PE header) to get the
> "Virtual Size" of the .data section, but I did not find any PE header
> parser in qemu, so that would be overkill?
> 
> Any idea?


hmm that's unfortunate. The size retrieval in these cases might have to do with
missing information from the image header, and we can't control that because we're
not generating the images. It would be better if they return '0' in these cases
(then we could special case them in the math), but if it's a non-zero return then
there's not much we can do.

If we can't reliably get the kernel size in RAM then I believe your patch is going
to make it do (until we want to load an even bigger kernel hehe). Can you please
document these findings (i.e. we can't retrieve the kernel size reliably) in the
commit msg?


Thanks,

Daniel

> 
> Thanks,
> 
> Alex
> 
>>>
>>> initrd start would then be:
>>>
>>>       start = kernel_entry + MIN(mem_size / 2, kernel_size);
>>>
>>> However, I believe we would like to keep the existing 128Mb minimum initrd start,
>>> even if the kernel is smaller than 128Mb, to avoid breaking existing configs that
>>> might be making this assumption. initrd start would then become:
>>>
>>>
>>>       start = kernel_entry + MIN(mem_size / 2, MAX(kernel_size, 128 * MiB));
>>
>> Great, I agree with you, thanks for the pointers. I'll just align the
>> size on a 2MB boundary to make sure the kernel mapping (which in the
>> case of Linux uses PMD) does not overlap with the initrd.
>>
>> I'll get back soon with a v2.
>>
>> Thanks again,
>>
>> Alex
>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Daniel
>>>
>>>
>>>>    hw/riscv/boot.c | 12 ++++++------
>>>>    1 file changed, 6 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
>>>> index 0ffca05189..9a367af2fa 100644
>>>> --- a/hw/riscv/boot.c
>>>> +++ b/hw/riscv/boot.c
>>>> @@ -188,13 +188,13 @@ static void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
>>>>         * kernel is uncompressed it will not clobber the initrd. However
>>>>         * on boards without much RAM we must ensure that we still leave
>>>>         * enough room for a decent sized initrd, and on boards with large
>>>> -     * amounts of RAM we must avoid the initrd being so far up in RAM
>>>> -     * that it is outside lowmem and inaccessible to the kernel.
>>>> -     * So for boards with less  than 256MB of RAM we put the initrd
>>>> -     * halfway into RAM, and for boards with 256MB of RAM or more we put
>>>> -     * the initrd at 128MB.
>>>> +     * amounts of RAM, we put the initrd at 512MB to allow large kernels
>>>> +     * to boot.
>>>> +     * So for boards with less than 1GB of RAM we put the initrd
>>>> +     * halfway into RAM, and for boards with 1GB of RAM or more we put
>>>> +     * the initrd at 512MB.
>>>>         */
>>>> -    start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
>>>> +    start = kernel_entry + MIN(mem_size / 2, 512 * MiB);
>>>>
>>>>        size = load_ramdisk(filename, start, mem_size - start);
>>>>        if (size == -1) {