[PATCH 0/1] hw/arm/aspeed: Add fby35 machine type

Peter Delevoryas posted 1 patch 1 year, 12 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20220503204451.1257898-1-pdel@fb.com
Maintainers: "Cédric Le Goater" <clg@kaod.org>, Peter Maydell <peter.maydell@linaro.org>, Andrew Jeffery <andrew@aj.id.au>, Joel Stanley <joel@jms.id.au>, Cleber Rosa <crosa@redhat.com>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>, Wainer dos Santos Moschetta <wainersm@redhat.com>, Beraldo Leal <bleal@redhat.com>
There is a newer version of this series
hw/arm/aspeed.c                     | 63 +++++++++++++++++++++++++++++
tests/avocado/boot_linux_console.py | 20 ++++++++-
2 files changed, 82 insertions(+), 1 deletion(-)
[PATCH 0/1] hw/arm/aspeed: Add fby35 machine type
Posted by Peter Delevoryas 1 year, 12 months ago
Hey everyone,

I'm submitting another Facebook (Meta Platforms) machine type: this time I'm
including an acceptance test too.

Unfortunately, this machine boots _very_ slowly. 300+ seconds. I'm not sure why
this is (so I don't know how to fix it easily) and I don't know how to override
the avocado test timeout just for a single test, so I increased the global
timeout for all "boot_linux_console.py" tests from 90s to 400s. I doubt this is
acceptable, but what other option is there? Should I add
AVOCADO_TIMEOUT_EXPECTED?

@skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')

What is the point of this environment variable though, except to skip it in CIT?
If I run the test with this environment variable defined, it doesn't disable the
timeout, it just skips it right? I want an option to run this test with a larger
timeout.

Thanks,
Peter

Peter Delevoryas (1):
  hw/arm/aspeed: Add fby35 machine type

 hw/arm/aspeed.c                     | 63 +++++++++++++++++++++++++++++
 tests/avocado/boot_linux_console.py | 20 ++++++++-
 2 files changed, 82 insertions(+), 1 deletion(-)

-- 
2.30.2
Re: [PATCH 0/1] hw/arm/aspeed: Add fby35 machine type
Posted by Cédric Le Goater 1 year, 12 months ago
On 5/3/22 22:44, Peter Delevoryas wrote:
> Hey everyone,
> 
> I'm submitting another Facebook (Meta Platforms) machine type: this time I'm
> including an acceptance test too.
> 
> Unfortunately, this machine boots _very_ slowly. 300+ seconds. 

This is too much for avocado tests.

> I'm not sure why this is (so I don't know how to fix it easily)

The fuji has the same kind of problem. It takes ages to load the lzma ramdisk.
Could it be a modeling issue, or something in how the FW image is compiled?

Thanks,

C.


> and I don't know how to override
> the avocado test timeout just for a single test, so I increased the global
> timeout for all "boot_linux_console.py" tests from 90s to 400s. I doubt this is
> acceptable, but what other option is there? Should I add
> AVOCADO_TIMEOUT_EXPECTED?
> 
> @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
> 
> What is the point of this environment variable though, except to skip it in CIT?
> If I run the test with this environment variable defined, it doesn't disable the
> timeout, it just skips it right? I want an option to run this test with a larger
> timeout.
> 
> Thanks,
> Peter
> 
> Peter Delevoryas (1):
>    hw/arm/aspeed: Add fby35 machine type
> 
>   hw/arm/aspeed.c                     | 63 +++++++++++++++++++++++++++++
>   tests/avocado/boot_linux_console.py | 20 ++++++++-
>   2 files changed, 82 insertions(+), 1 deletion(-)
>
Re: [PATCH 0/1] hw/arm/aspeed: Add fby35 machine type
Posted by Peter Delevoryas 1 year, 12 months ago

> On May 3, 2022, at 2:35 PM, Cédric Le Goater <clg@kaod.org> wrote:
> 
> On 5/3/22 22:44, Peter Delevoryas wrote:
>> Hey everyone,
>> I'm submitting another Facebook (Meta Platforms) machine type: this time I'm
>> including an acceptance test too.
>> Unfortunately, this machine boots _very_ slowly. 300+ seconds. 
> 
> This is too much for avocado tests.

Erg, yeah I figured as much. I’ll just resubmit it without the avocado test then,
if that sounds ok to you.

> 
>> I'm not sure why this is (so I don't know how to fix it easily)
> 
> The fuji has the same kind of problem. It takes ages to load the lzma ramdisk.
> Could it be a modeling issue ? or how the FW image is compiled ?

Yeah, one reason is that Facebook OpenBMC machines have an unnecessarily
big initramfs that includes all the rootfs stuff, whereas regular OpenBMC
machines have a smaller initramfs right? I don’t entirely know what I’m talking
about though.

I think most FB machines have moved to zstd compression recently though,
but this one may have been missed: I can fix that on the image side. I’ll
also experiment more to see if it’s something wrong with the image, or possibly
a regression in QEMU. It would really be super awesome if it could boot faster,
so I’m very motivated to find a solution.

> 
> Thanks,
> 
> C.
> 
> 
>> and I don't know how to override
>> the avocado test timeout just for a single test, so I increased the global
>> timeout for all "boot_linux_console.py" tests from 90s to 400s. I doubt this is
>> acceptable, but what other option is there? Should I add
>> AVOCADO_TIMEOUT_EXPECTED?
>> @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
>> What is the point of this environment variable though, except to skip it in CIT?
>> If I run the test with this environment variable defined, it doesn't disable the
>> timeout, it just skips it right? I want an option to run this test with a larger
>> timeout.
>> Thanks,
>> Peter
>> Peter Delevoryas (1):
>>   hw/arm/aspeed: Add fby35 machine type
>>  hw/arm/aspeed.c                     | 63 +++++++++++++++++++++++++++++
>>  tests/avocado/boot_linux_console.py | 20 ++++++++-
>>  2 files changed, 82 insertions(+), 1 deletion(-)
> 

Re: [PATCH 0/1] hw/arm/aspeed: Add fby35 machine type
Posted by Philippe Mathieu-Daudé 1 year, 11 months ago
On 4/5/22 00:47, Peter Delevoryas wrote:
> 
> 
>> On May 3, 2022, at 2:35 PM, Cédric Le Goater <clg@kaod.org> wrote:
>>
>> On 5/3/22 22:44, Peter Delevoryas wrote:
>>> Hey everyone,
>>> I'm submitting another Facebook (Meta Platforms) machine type: this time I'm
>>> including an acceptance test too.
>>> Unfortunately, this machine boots _very_ slowly. 300+ seconds.
>>
>> This is too much for avocado tests.

Use:

   @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
   @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'),
               'Big initramfs and run from flash')
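
As a rough illustration of how these two decorators combine, here is a
minimal sketch. It is an assumption-laden stand-in, not the actual QEMU test:
it uses the standard library's unittest.skipIf/skipUnless (which take the same
condition and reason arguments) instead of avocado's, and the names
build_slow_boot_test and run_with_env are hypothetical helpers invented for
this example.

```python
import os
import unittest

GATE_VARS = ('GITLAB_CI', 'AVOCADO_TIMEOUT_EXPECTED')


def build_slow_boot_test():
    """Define the test class now, so the decorators see the current environment."""

    class SlowBootTest(unittest.TestCase):
        # Stand-ins for avocado's skipIf/skipUnless (assumption: same semantics).
        @unittest.skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
        @unittest.skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'),
                             'Big initramfs and run from flash')
        def test_boot(self):
            # Placeholder for the real (slow) boot check.
            self.assertTrue(True)

    return SlowBootTest


def run_with_env(env):
    """Run the test with only the given gate variables set; report the outcome."""
    saved = {key: os.environ.get(key) for key in GATE_VARS}
    try:
        for key in GATE_VARS:
            os.environ.pop(key, None)
        os.environ.update(env)
        suite = unittest.defaultTestLoader.loadTestsFromTestCase(
            build_slow_boot_test())
        result = unittest.TestResult()
        suite.run(result)
    finally:
        # Restore the caller's environment exactly as it was.
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value
    if result.skipped:
        return 'skipped: ' + result.skipped[0][1]
    return 'ran' if result.wasSuccessful() else 'failed'


if __name__ == '__main__':
    print(run_with_env({}))
    print(run_with_env({'AVOCADO_TIMEOUT_EXPECTED': '1'}))
    print(run_with_env({'GITLAB_CI': 'true', 'AVOCADO_TIMEOUT_EXPECTED': '1'}))
```

The decorator arguments are evaluated when the class is defined, so the
variables have to be set in the environment before the test module is loaded.
Note that in this sketch the variables only gate whether the test runs; they
do not change any timeout.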

> Erg, yeah I figured as much. I’ll just resubmit it without the avocado test then,
> if that sounds ok to you.

No, please keep the test. While it won't run on CI, we can run it
locally, which is very useful for bisecting.

>>> I'm not sure why this is (so I don't know how to fix it easily)
>>
>> The fuji has the same kind of problem. It takes ages to load the lzma ramdisk.
>> Could it be a modeling issue ? or how the FW image is compiled ?
> 
> Yeah, one reason is that Facebook OpenBMC machines have an unnecessarily
> big initramfs that includes all the rootfs stuff, whereas regular OpenBMC
> machines have a smaller initramfs right? I don’t entirely know what I’m talking
> about though.
> 
> I think most FB machines have moved to zstd compression recently though,
> but this one may have been missed: I can fix that on the image side. I’ll
> also experiment more to see if it’s something wrong with the image, or possibly
> a regression in QEMU. It would really be super awesome if it could boot faster,
> so I’m very motivated to find a solution.

Re: [PATCH 0/1] hw/arm/aspeed: Add fby35 machine type
Posted by Peter Delevoryas 1 year, 11 months ago

> On May 30, 2022, at 8:29 AM, Philippe Mathieu-Daudé via <qemu-arm@nongnu.org> wrote:
> 
> On 4/5/22 00:47, Peter Delevoryas wrote:
>>> On May 3, 2022, at 2:35 PM, Cédric Le Goater <clg@kaod.org> wrote:
>>> 
>>> On 5/3/22 22:44, Peter Delevoryas wrote:
>>>> Hey everyone,
>>>> I'm submitting another Facebook (Meta Platforms) machine type: this time I'm
>>>> including an acceptance test too.
>>>> Unfortunately, this machine boots _very_ slowly. 300+ seconds.
>>> 
>>> This is too much for avocado tests.
> 
> Use:
> 
>  @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
>  @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'),
>              'Big initramfs and run from flash')

Thanks for this suggestion!

> 
>> Erg, yeah I figured as much. I’ll just resubmit it without the avocado test then,
>> if that sounds ok to you.
> 
> No, please keep the test. While it won't run on CI, we can run it locally, very useful to bisect.

Ok, I’d be happy to resubmit the test now with the @skipIf and @skipUnless decorators
(since the machine definition has been merged at this point).

> 
>>>> I'm not sure why this is (so I don't know how to fix it easily)
>>> 
>>> The fuji has the same kind of problem. It takes ages to load the lzma ramdisk.
>>> Could it be a modeling issue ? or how the FW image is compiled ?
>> Yeah, one reason is that Facebook OpenBMC machines have an unnecessarily
>> big initramfs that includes all the rootfs stuff, whereas regular OpenBMC
>> machines have a smaller initramfs right? I don’t entirely know what I’m talking
>> about though.
>> I think most FB machines have moved to zstd compression recently though,
>> but this one may have been missed: I can fix that on the image side. I’ll
>> also experiment more to see if it’s something wrong with the image, or possibly
>> a regression in QEMU. It would really be super awesome if it could boot faster,
>> so I’m very motivated to find a solution.
> 

Re: [PATCH 0/1] hw/arm/aspeed: Add fby35 machine type
Posted by Cédric Le Goater 1 year, 12 months ago
On 5/4/22 00:47, Peter Delevoryas wrote:
> 
> 
>> On May 3, 2022, at 2:35 PM, Cédric Le Goater <clg@kaod.org> wrote:
>>
>> On 5/3/22 22:44, Peter Delevoryas wrote:
>>> Hey everyone,
>>> I'm submitting another Facebook (Meta Platforms) machine type: this time I'm
>>> including an acceptance test too.
>>> Unfortunately, this machine boots _very_ slowly. 300+ seconds.
>>
>> This is too much for avocado tests.
> 
> Erg, yeah I figured as much. I’ll just resubmit it without the avocado test then,
> if that sounds ok to you.
> 
>>
>>> I'm not sure why this is (so I don't know how to fix it easily)
>>
>> The fuji has the same kind of problem. It takes ages to load the lzma ramdisk.
>> Could it be a modeling issue ? or how the FW image is compiled ?
> 
> Yeah, one reason is that Facebook OpenBMC machines have an unnecessarily
> big initramfs that includes all the rootfs stuff, 

Indeed,

    Trying 'ramdisk@1' ramdisk subimage
      Description:  RAMDISK
      Type:         RAMDisk Image
      Compression:  lzma compressed
      Data Start:   0x2047da18
      Data Size:    21938373 Bytes = 20.9 MiB

That doesn't help for sure.

> whereas regular OpenBMC machines have a smaller initramfs right? 

Yes, about 1 MB.

> I don’t entirely know what I’m talking about though.
> 
> I think most FB machines have moved to zstd compression recently though,
> but this one may have been missed: I can fix that on the image side. I’ll
> also experiment more to see if it’s something wrong with the image, or possibly
> a regression in QEMU. It would really be super awesome if it could boot faster,
> so I’m very motivated to find a solution.

There is something else going on, though, because loading the kernel on the
fuji takes much longer than on the ast2600-evb, and it is the same size:

    Trying 'kernel@1' kernel subimage
      Description:  Linux kernel
      Type:         Kernel Image
      Compression:  uncompressed
      Data Start:   0x201000e0
      Data Size:    3659848 Bytes = 3.5 MiB


Is U-Boot doing some special CPU configuration which would slow down
emulation? Maybe try profiling.

Thanks,

C.

Re: [PATCH 0/1] hw/arm/aspeed: Add fby35 machine type
Posted by Peter Delevoryas 1 year, 12 months ago

> On May 3, 2022, at 3:47 PM, Peter Delevoryas <pdel@fb.com> wrote:
> 
> 
> 
>> On May 3, 2022, at 2:35 PM, Cédric Le Goater <clg@kaod.org> wrote:
>> 
>> On 5/3/22 22:44, Peter Delevoryas wrote:
>>> Hey everyone,
>>> I'm submitting another Facebook (Meta Platforms) machine type: this time I'm
>>> including an acceptance test too.
>>> Unfortunately, this machine boots _very_ slowly. 300+ seconds. 
>> 
>> This is too much for avocado tests.
> 
> Erg, yeah I figured as much. I’ll just resubmit it without the avocado test then,
> if that sounds ok to you.
> 
>> 
>>> I'm not sure why this is (so I don't know how to fix it easily)
>> 
>> The fuji has the same kind of problem. It takes ages to load the lzma ramdisk.
>> Could it be a modeling issue ? or how the FW image is compiled ?
> 
> Yeah, one reason is that Facebook OpenBMC machines have an unnecessarily
> big initramfs that includes all the rootfs stuff, whereas regular OpenBMC
> machines have a smaller initramfs right? I don’t entirely know what I’m talking
> about though.
> 
> I think most FB machines have moved to zstd compression recently though,
> but this one may have been missed: I can fix that on the image side. I’ll
> also experiment more to see if it’s something wrong with the image, or possibly
> a regression in QEMU. It would really be super awesome if it could boot faster,
> so I’m very motivated to find a solution.

Oh, I forgot: somebody reminded me that we also execute early U-Boot SPL code in flash,
i.e. without SRAM, etc. That is also probably different from most other machines.

> 
>> 
>> Thanks,
>> 
>> C.
>> 
>> 
>>> and I don't know how to override
>>> the avocado test timeout just for a single test, so I increased the global
>>> timeout for all "boot_linux_console.py" tests from 90s to 400s. I doubt this is
>>> acceptable, but what other option is there? Should I add
>>> AVOCADO_TIMEOUT_EXPECTED?
>>> @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
>>> What is the point of this environment variable though, except to skip it in CIT?
>>> If I run the test with this environment variable defined, it doesn't disable the
>>> timeout, it just skips it right? I want an option to run this test with a larger
>>> timeout.
>>> Thanks,
>>> Peter
>>> Peter Delevoryas (1):
>>>  hw/arm/aspeed: Add fby35 machine type
>>> hw/arm/aspeed.c                     | 63 +++++++++++++++++++++++++++++
>>> tests/avocado/boot_linux_console.py | 20 ++++++++-
>>> 2 files changed, 82 insertions(+), 1 deletion(-)
>> 
>