[PATCH v2 1/2] tests/avocado: use default amount of cores on sbsa-ref

Marcin Juszkiewicz posted 2 patches 5 months, 1 week ago
Maintainers: Radoslaw Biernacki <rad@semihalf.com>, Peter Maydell <peter.maydell@linaro.org>, Leif Lindholm <quic_llindhol@quicinc.com>, Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>, Cleber Rosa <crosa@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Wainer dos Santos Moschetta <wainersm@redhat.com>, Beraldo Leal <bleal@redhat.com>
There is a newer version of this series
[PATCH v2 1/2] tests/avocado: use default amount of cores on sbsa-ref
Posted by Marcin Juszkiewicz 5 months, 1 week ago
I was wondering why avocado tests passed with firmware which crashes
when anyone else is using it.

Turned out that amount of cores matters. Have to find out why still.

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
---
 tests/avocado/machine_aarch64_sbsaref.py | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tests/avocado/machine_aarch64_sbsaref.py b/tests/avocado/machine_aarch64_sbsaref.py
index 6bb82f2a03..136b495096 100644
--- a/tests/avocado/machine_aarch64_sbsaref.py
+++ b/tests/avocado/machine_aarch64_sbsaref.py
@@ -75,8 +75,6 @@ def fetch_firmware(self):
             f"if=pflash,file={fs0_path},format=raw",
             "-drive",
             f"if=pflash,file={fs1_path},format=raw",
-            "-smp",
-            "1",
             "-machine",
             "sbsa-ref",
         )
-- 
2.45.1
Re: [PATCH v2 1/2] tests/avocado: use default amount of cores on sbsa-ref
Posted by Peter Maydell 5 months, 1 week ago
On Thu, 20 Jun 2024 at 07:00, Marcin Juszkiewicz
<marcin.juszkiewicz@linaro.org> wrote:
>
> I was wondering why avocado tests passed with firmware which crashes
> when anyone else is using it.
>
> Turned out that amount of cores matters. Have to find out why still.

This commit message confuses me. It reads like "running with
two cores will make the guest crash", i.e. "apply this patch
and the test suite will stop passing". I assume that's not
the case, but what's actually going on here?

thanks
-- PMM
Re: [PATCH v2 1/2] tests/avocado: use default amount of cores on sbsa-ref
Posted by Marcin Juszkiewicz 5 months, 1 week ago
On 20.06.2024 at 11:34, Peter Maydell wrote:
> On Thu, 20 Jun 2024 at 07:00, Marcin Juszkiewicz 
> <marcin.juszkiewicz@linaro.org> wrote:
>> 
>> I was wondering why avocado tests passed with firmware which
>> crashes when anyone else is using it.
>> 
>> Turned out that amount of cores matters. Have to find out why
>> still.
> 
> This commit message confuses me.

I had no idea how to write it in a more readable form. I will reword it
for v3 (with the patch order reversed, as recommended by Philippe).

> It reads like "running with two cores will make the guest crash",
> i.e. "apply this patch and the test suite will stop passing". I
> assume that's not the case, but what's actually going on here?

That's exactly the case. With the sbsa-ref firmware that QEMU uses now,
we get a crash if more than one core is used. The avocado test hardcoded
"-smp 1" and was passing fine.
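
As a rough illustration only (not the code from the test file), the
difference between the old and new invocation can be sketched like this.
The helper name build_sbsa_ref_args and its smp parameter are hypothetical;
only the argument strings come from the diff above.

```python
def build_sbsa_ref_args(fs0_path, fs1_path, smp=None):
    """Return a QEMU argument list resembling the one in the test.

    Hypothetical helper for illustration; the real test passes these
    values directly instead of going through a function like this.
    """
    args = [
        "-drive", f"if=pflash,file={fs0_path},format=raw",
        "-drive", f"if=pflash,file={fs1_path},format=raw",
    ]
    # Before the patch the test always added "-smp 1"; after it, no -smp
    # option is passed and the machine's default core count is used.
    if smp is not None:
        args += ["-smp", str(smp)]
    args += ["-machine", "sbsa-ref"]
    return args
```

Calling build_sbsa_ref_args(fs0, fs1, smp=1) models the old behaviour, and
leaving smp out models the patched test.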

And I forgot to mail qemu-devel when I got hit by that crash.

This week Rebecca Cran pointed out to me that the crash is in BootLogoLib
in EDK2, and I wrote a workaround to make things work. Then Ard Biesheuvel
found the real cause, fixed QemuVideoDxe in EDK2, and we got sbsa-ref
running with any number of cores.

The commit message of the fix:

commit c1d1910be6e04a8b1a73090cf2881fb698947a6e
Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Mon Jun 17 17:07:41 2024 +0200

OvmfPkg/QemuVideoDxe: add feature PCD to remap framebuffer W/C

Some platforms (such as SBSA-QEMU on recent builds of the emulator) only
tolerate misaligned accesses to normal memory, and raise alignment
faults on such accesses to device memory, which is the default for PCIe
MMIO BARs.

When emulating a PCIe graphics controller, the framebuffer is typically
exposed via a MMIO BAR, while the disposition of the region is closer to
memory (no side effects on reads or writes, except for the changing
picture on the screen; direct random access to any pixel in the image).

In order to permit the use of such controllers on platforms that only
tolerate these types of accesses for normal memory, it is necessary to
remap the memory. Use the DXE services to set the desired capabilities
and attributes.

Hide this behavior under a feature PCD so only platforms that really
need it can enable it. (OVMF on x86 has no need for this)
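
To make the rule in that commit message concrete, here is a toy model in
Python. It is a simplification under stated assumptions, not how QEMU or
EDK2 implement the check; the names is_aligned and access_faults are made
up for this sketch, and "aligned" here means natural alignment of the
access size.

```python
def is_aligned(addr: int, size: int) -> bool:
    """True if an access of `size` bytes at `addr` is naturally aligned."""
    return addr % size == 0


def access_faults(addr: int, size: int, device_memory: bool) -> bool:
    """Simplified model of the behaviour described above.

    Misaligned accesses are tolerated for normal memory, but raise an
    alignment fault when the target is mapped as device memory.
    """
    return device_memory and not is_aligned(addr, size)
```

In this model, a 4-byte write at an address ending in 0x2 faults when the
framebuffer BAR is mapped as device memory, and succeeds once the region
is remapped as normal memory, which is the effect the feature PCD enables.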
Re: [PATCH v2 1/2] tests/avocado: use default amount of cores on sbsa-ref
Posted by Peter Maydell 5 months, 1 week ago
On Thu, 20 Jun 2024 at 10:55, Marcin Juszkiewicz
<marcin.juszkiewicz@linaro.org> wrote:
>
> On 20.06.2024 at 11:34, Peter Maydell wrote:
> > On Thu, 20 Jun 2024 at 07:00, Marcin Juszkiewicz
> > <marcin.juszkiewicz@linaro.org> wrote:
> >>
> >> I was wondering why avocado tests passed with firmware which
> >> crashes when anyone else is using it.
> >>
> >> Turned out that amount of cores matters. Have to find out why
> >> still.
> >
> > This commit message confuses me.
>
> Had no idea how to write in more readable form. Will reword it for v3
> (with reverse order of patches as recommended by Philippe.
>
> > It reads like "running with two cores will make the guest crash",
> > i.e. "apply this patch and the test suite will stop passing". I
> > assume that's not the case, but what's actually going on here?
>
> That's exactly the case. With sbsa-ref firmware which qemu uses now we
> have crash if more than 1 core is used. Avocado test hardcoded "-smp 1"
> and was passing fine.
>
> And I forgot to mail qemu-devel when I got hit by that crash.
>
> This week Rebecca Cran pointed me that crash is in BootLogoLib in EDK2
> and I wrote some workaround for make things work. Then Ard Biesheuvel
> found the real reason, fixed QemuVideoDxe in EDK2 and we got sbsa-ref
> running with any amount of cores.

Oh, OK, so it's just random bad luck that enabling the second
CPU means that we end up doing an unaligned access to the
framebuffer, I guess.

Then, yes, Philippe is right and we need to update the sbsa-ref
firmware we're using for the test first, to avoid breaking bisection.

For a commit message for this patch, maybe something like:

 The version of the sbsa-ref EDK2 firmware we used to use in this
 test had a bug where it might make an unaligned access to the
 framebuffer, which causes a guest crash on newer versions of
 QEMU where we enforce the architectural requirement that
 unaligned accesses to Device memory should take an exception.
 We happened to not notice this because our test was booting with
 "-smp 1" and through luck this didn't write the boot logo to
 the framebuffer at an unaligned address; but trying to boot the
 same firmware with two CPUs would result in a guest crash.
 Now we have updated the firmware we're using for the test, we can
 make the test use all the cores on the board, so we are testing the
 SMP boot path.

?

thanks
-- PMM