On 21/10/2024 15.59, Peter Maydell wrote:
> On Mon, 21 Oct 2024 at 14:55, Thomas Huth <thuth@redhat.com> wrote:
>>
>> On 21/10/2024 15.18, Thomas Huth wrote:
>>> On 21/10/2024 15.00, Peter Maydell wrote:
>>>> On Mon, 21 Oct 2024 at 12:35, Thomas Huth <thuth@redhat.com> wrote:
>>>>>
>>>>> The following changes since commit f1dd640896ee2b50cb34328f2568aad324702954:
>>>>>
>>>>> Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into
>>>>> staging (2024-10-18 10:42:56 +0100)
>>>>>
>>>>> are available in the Git repository at:
>>>>>
>>>>> https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-21
>>>>>
>>>>> for you to fetch changes up to ee772a332af8f23acf604ad0fb5132f886b0eb16:
>>>>>
>>>>> tests/functional: Convert the Avocado sh4 tuxrun test (2024-10-21
>>>>> 13:25:12 +0200)
>>>>>
>>>>> ----------------------------------------------------------------
>>>>> * Convert the Tuxrun Avocado tests to the new functional framework
>>>>> * Update the OpenBSD CI image to OpenBSD v7.6
>>>>> * Bump timeout of the ide-test
>>>>> * New maintainer for the QTests
>>>>> * Disable the pci-bridge on s390x by default
>>>>>
>>>>> ----------------------------------------------------------------
>>>>
>>>> Couple of failures on the functional-tests:
>>>>
>>>> https://gitlab.com/qemu-project/qemu/-/jobs/8140716604
>>>>
>>>> 7/28 qemu:func-thorough+func-aarch64-thorough+thorough /
>>>> func-aarch64-aarch64_tuxrun TIMEOUT 120.06s killed by signal 15
>>>> SIGTERM
>>>>
>>>> https://gitlab.com/qemu-project/qemu/-/jobs/8140716520
>>>>
>>>> 14/17 qemu:func-thorough+func-loongarch64-thorough+thorough /
>>>> func-loongarch64-loongarch64_virt TIMEOUT 60.09s killed by signal 15
>>>> SIGTERM
>>>>
>>>> I'm retrying to see if these are intermittent, but they suggest that we
>>>> should bump the timeouts for these tests.
>>>
>>> Everything was fine with the gitlab shared runners
>>> (https://gitlab.com/thuth/qemu/-/pipelines/1504882880), but yes, it's
>>> likely the private runners being slow again...
>>>
>>> So please don't merge it yet; I'll go through the jobs of the private
>>> runners and update the timeouts of the failed jobs and of the ones that
>>> are getting close to the limit.
>>
>> Actually, looking at it again, the func-loongarch64-loongarch64_virt test is
>> not a new one; it was merged quite a while ago already. And in previous
>> runs, it only took 6-10 seconds:
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/8125336852#L810
>> https://gitlab.com/qemu-project/qemu/-/jobs/8111434905#L740
>>
>> So maybe this was indeed just a temporary blip in the test runners? Could
>> you please rerun the jobs to see how long they take then?
>
> The alpine job passed on the retry:
> https://gitlab.com/qemu-project/qemu/-/jobs/8141648479
> and the func-loongarch64-loongarch64_virt test took 5.08s.
>
> The opensuse job failed again:
> https://gitlab.com/qemu-project/qemu/-/jobs/8141649069
> 7/28 qemu:func-thorough+func-aarch64-thorough+thorough /
> func-aarch64-aarch64_tuxrun TIMEOUT 120.04s killed by signal 15
> SIGTERM
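(If this really were just slow runners, the fix would be a one-line bump of
the per-test timeout in tests/functional/meson.build - a sketch, assuming the
test_timeouts dictionary there still has its current shape, with the new
value picked arbitrarily:

  test_timeouts = {
    ...
    'aarch64_tuxrun' : 240,   # hypothetical bump from the current 120s
    ...
  }

But it does not look like a plain timeout issue here.)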
Looking at the log files of the job, I can see the following in
https://gitlab.com/qemu-project/qemu/-/jobs/8141649069/artifacts/browse/build/tests/functional/aarch64/test_aarch64_tuxrun.TuxRunAarch64Test.test_arm64be/console.log:
2024-10-21 13:20:32,844: Run /sbin/init as init process
2024-10-21 13:20:34,043: EXT4-fs (vda): re-mounted. Opts: (null). Quota mode: none.
2024-10-21 13:20:34,350: Starting syslogd: OK
2024-10-21 13:20:34,423: Starting klogd: OK
2024-10-21 13:20:34,667: Running sysctl: OK
2024-10-21 13:20:34,739: Saving 2048 bits of non-creditable seed for next boot
2024-10-21 13:20:34,966: Starting network: blk_update_request: I/O error, dev vda, sector 5824 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
2024-10-21 13:20:35,028: blk_update_request: I/O error, dev vda, sector 8848 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
2024-10-21 13:20:35,051: OK
2024-10-21 13:20:35,088: blk_update_request: I/O error, dev vda, sector 12936 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
2024-10-21 13:20:35,149: blk_update_request: I/O error, dev vda, sector 17032 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
2024-10-21 13:20:35,181: Welcome to TuxTest
2024-10-21 13:20:35,882: tuxtest login: blk_update_request: I/O error, dev vda, sector 21128 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
2024-10-21 13:20:35,882: blk_update_request: I/O error, dev vda, sector 25224 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
2024-10-21 13:20:35,882: blk_update_request: I/O error, dev vda, sector 29320 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
2024-10-21 13:20:35,887: root
So this is indeed more than just a timeout setting that is too small - all of
these errors are failing WRITE_ZEROES requests (op 0x9) on the virtio disk...
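One way to narrow it down might be to retry with WRITE_ZEROES support
disabled on the device, so that the guest falls back to plain writes - a
hedged example (virtio-blk has had a "write-zeroes" property since QEMU 4.0;
the device type and drive id the test actually uses are assumptions here):

  -device virtio-blk-pci,drive=vd0,write-zeroes=off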
I don't get the virtio errors when running the test locally, though.
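For reference, this is roughly how such a test can be run standalone from the
build directory (along the lines of docs/devel/testing/functional.rst; the
exact paths are assumptions for a typical out-of-tree build):

  cd build
  export PYTHONPATH=../python:../tests/functional
  export QEMU_TEST_QEMU_BINARY=$PWD/qemu-system-aarch64
  python3 ../tests/functional/test_aarch64_tuxrun.py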
I guess this needs some more investigation first ... it's probably best if I
respin the pull request without this patch for now, until this is understood
and fixed.
Thomas