tests/functional: Bump timeouts of functional tests

[PATCH] tests/functional: Bump timeouts of functional tests

Posted by Thomas Huth 1 year, 3 months ago

When building QEMU with "--enable-debug" and running the tests
in parallel with "make -j$(nproc) check-functional", many tests are
still timing out due to our conservative timeout settings. Bump
the timeouts of the problematic tests and also increase the default
timeout to 90 seconds (from 60 seconds) to be on the safe side.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 tests/functional/meson.build | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/tests/functional/meson.build b/tests/functional/meson.build
index d5296bff8b..3561f987a6 100644
--- a/tests/functional/meson.build
+++ b/tests/functional/meson.build
@@ -11,24 +11,26 @@ endif
 
 # Timeouts for individual tests that can be slow e.g. with debugging enabled
 test_timeouts = {
-  'aarch64_raspi4' : 120,
+  'aarch64_raspi4' : 480,
   'aarch64_sbsaref' : 600,
-  'aarch64_virt' : 360,
-  'acpi_bits' : 240,
+  'aarch64_virt' : 720,
+  'acpi_bits' : 420,
   'arm_aspeed' : 600,
-  'arm_bpim2u' : 360,
+  'arm_bpim2u' : 500,
+  'arm_collie' : 180,
   'arm_orangepi' : 540,
   'arm_raspi2' : 120,
-  'arm_tuxrun' : 120,
+  'arm_tuxrun' : 240,
   'arm_sx1' : 360,
   'mips_malta' : 120,
   'netdev_ethtool' : 180,
   'ppc_40p' : 240,
   'ppc64_hv' : 1000,
-  'ppc64_powernv' : 240,
-  'ppc64_pseries' : 240,
-  'ppc64_tuxrun' : 240,
-  's390x_ccw_virtio' : 240,
+  'ppc64_powernv' : 480,
+  'ppc64_pseries' : 480,
+  'ppc64_tuxrun' : 420,
+  'riscv64_tuxrun' : 120,
+  's390x_ccw_virtio' : 420,
 }
 
 tests_generic_system = [
@@ -273,8 +275,8 @@ foreach speed : ['quick', 'thorough']
            env: test_env,
            args: [testpath],
            protocol: 'tap',
-           timeout: test_timeouts.get(test, 60),
-           priority: test_timeouts.get(test, 60),
+           timeout: test_timeouts.get(test, 90),
+           priority: test_timeouts.get(test, 90),
            suite: suites)
     endforeach
   endforeach
-- 
2.47.0

Re: [PATCH] tests/functional: Bump timeouts of functional tests

Posted by Daniel P. Berrangé 1 year, 3 months ago

On Wed, Nov 06, 2024 at 06:09:46PM +0100, Thomas Huth wrote:
> When building QEMU with "--enable-debug" and running the tests
> in parallel with "make -j$(nproc) check-functional", many tests are
> still timing out due to our conservative timeout settings. Bump
> the timeouts of the problematic tests and also increase the default
> timeout to 90 seconds (from 60 seconds) to be on the safe side.

Rather than tweak individual tests, how about we just apply
a uniform x3 multiplier when --enable-debug is present,
including on all the non-listed tests which have the default
timeout?
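
A minimal sketch of what such a uniform multiplier might look like, written
as a standalone meson.build so it can be tried in isolation. It assumes that
an unoptimized build (the built-in Meson option optimization set to '0') is
a good enough signal for a debug build; the names timeout_multiplier and
some_unlisted_test are hypothetical and not taken from the QEMU tree, and
the two table entries are just the pre-patch values from the diff above.

```meson
project('timeout-multiplier-sketch')

# Two entries copied from the existing table, for illustration only.
test_timeouts = {
  'aarch64_virt' : 360,
  'ppc64_hv' : 1000,
}

# Assumption: treat an -O0 build as "debug" and scale every timeout by 3.
timeout_multiplier = get_option('optimization') == '0' ? 3 : 1

# 'some_unlisted_test' is a hypothetical name that is not in the table,
# so it picks up the default of 60 seconds before scaling.
foreach test : ['aarch64_virt', 'ppc64_hv', 'some_unlisted_test']
  test_timeout = test_timeouts.get(test, 60) * timeout_multiplier
  message('@0@: @1@ seconds'.format(test, test_timeout))
endforeach
```

Configured with -Doptimization=0, the three timeouts would come out as 1080,
3000 and 180 seconds; with any other optimization level they stay at 360,
1000 and 60. In the real file the scaled value would be passed to both the
timeout: and priority: arguments of test().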

> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  tests/functional/meson.build | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] tests/functional: Bump timeouts of functional tests

Posted by Thomas Huth 1 year, 3 months ago

On 06/11/2024 19.04, Daniel P. Berrangé wrote:
> On Wed, Nov 06, 2024 at 06:09:46PM +0100, Thomas Huth wrote:
>> When building QEMU with "--enable-debug" and running the tests
>> in parallel with "make -j$(nproc) check-functional", many tests are
>> still timing out due to our conservative timeout settings. Bump
>> the timeouts of the problematic tests and also increase the default
>> timeout to 90 seconds (from 60 seconds) to be on the safe side.
> 
> Rather than tweak individual tests, how about we just apply
> a uniform x3 multiplier when --enable-debug is present,
> including on all the non-listed tests which have the default
> timeout?

That might be helpful, could you suggest a patch?

Anyway, I think we should still include this patch, since some people
might still have older, slower computers, so it would be good to have
relaxed timeouts for certain tests in any case.

  Thomas

Re: [PATCH] tests/functional: Bump timeouts of functional tests

Posted by Pierrick Bouvier 1 year, 3 months ago

On 11/6/24 09:09, Thomas Huth wrote:
> When building QEMU with "--enable-debug" and running the tests
> in parallel with "make -j$(nproc) check-functional", many tests are
> still timing out due to our conservative timeout settings. Bump
> the timeouts of the problematic tests and also increase the default
> timeout to 90 seconds (from 60 seconds) to be on the safe side.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>   tests/functional/meson.build | 24 +++++++++++++-----------
>   1 file changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/tests/functional/meson.build b/tests/functional/meson.build
> index d5296bff8b..3561f987a6 100644
> --- a/tests/functional/meson.build
> +++ b/tests/functional/meson.build
> @@ -11,24 +11,26 @@ endif
>   
>   # Timeouts for individual tests that can be slow e.g. with debugging enabled
>   test_timeouts = {
> -  'aarch64_raspi4' : 120,
> +  'aarch64_raspi4' : 480,
>     'aarch64_sbsaref' : 600,
> -  'aarch64_virt' : 360,
> -  'acpi_bits' : 240,
> +  'aarch64_virt' : 720,
> +  'acpi_bits' : 420,
>     'arm_aspeed' : 600,
> -  'arm_bpim2u' : 360,
> +  'arm_bpim2u' : 500,
> +  'arm_collie' : 180,
>     'arm_orangepi' : 540,
>     'arm_raspi2' : 120,
> -  'arm_tuxrun' : 120,
> +  'arm_tuxrun' : 240,
>     'arm_sx1' : 360,
>     'mips_malta' : 120,
>     'netdev_ethtool' : 180,
>     'ppc_40p' : 240,
>     'ppc64_hv' : 1000,
> -  'ppc64_powernv' : 240,
> -  'ppc64_pseries' : 240,
> -  'ppc64_tuxrun' : 240,
> -  's390x_ccw_virtio' : 240,
> +  'ppc64_powernv' : 480,
> +  'ppc64_pseries' : 480,
> +  'ppc64_tuxrun' : 420,
> +  'riscv64_tuxrun' : 120,
> +  's390x_ccw_virtio' : 420,
>   }
>   
>   tests_generic_system = [
> @@ -273,8 +275,8 @@ foreach speed : ['quick', 'thorough']
>              env: test_env,
>              args: [testpath],
>              protocol: 'tap',
> -           timeout: test_timeouts.get(test, 60),
> -           priority: test_timeouts.get(test, 60),
> +           timeout: test_timeouts.get(test, 90),
> +           priority: test_timeouts.get(test, 90),
>              suite: suites)
>       endforeach
>     endforeach

I noticed that --enable-debug in configure is a combination of enabling
checks (enable-debug-tcg + graph + mutex) and deactivating optimizations.

Would it be worth keeping the optimizations and runtime checks instead? 
This way, there would be no more "timeout" issue.

I'm not sure what added value we get from -O0, except for debugging
locally QEMU.

Re: [PATCH] tests/functional: Bump timeouts of functional tests

Posted by Peter Maydell 1 year, 3 months ago

On Wed, 6 Nov 2024 at 17:21, Pierrick Bouvier
<pierrick.bouvier@linaro.org> wrote:
> I noticed by --enable-debug in configure is a combination of enabling
> checks (enable-debug-tcg + graph + mutex), and deactivating optimizations.
>
> Would it be worth keeping the optimizations and runtime checks instead?
> This way, there would be no more "timeout" issue.
>
> I'm not sure which added value we get from O0, except for debugging
> locally QEMU.

"Debugging locally QEMU" is exactly what --enable-debug is intended for...

thanks
-- PMM

Re: [PATCH] tests/functional: Bump timeouts of functional tests

Posted by Richard Henderson 1 year, 3 months ago

On 11/6/24 17:26, Peter Maydell wrote:
> On Wed, 6 Nov 2024 at 17:21, Pierrick Bouvier
> <pierrick.bouvier@linaro.org> wrote:
>> I noticed by --enable-debug in configure is a combination of enabling
>> checks (enable-debug-tcg + graph + mutex), and deactivating optimizations.
>>
>> Would it be worth keeping the optimizations and runtime checks instead?
>> This way, there would be no more "timeout" issue.
>>
>> I'm not sure which added value we get from O0, except for debugging
>> locally QEMU.
> 
> "Debugging locally QEMU" is exactly what --enable-debug is intended for...

I think Pierrick is asking why we don't use --enable-debug-tcg for CI?
That is, enable optimization and checks since we won't be doing local debugging within CI.

r~

Re: [PATCH] tests/functional: Bump timeouts of functional tests

Posted by Pierrick Bouvier 1 year, 3 months ago

On 11/6/24 09:26, Peter Maydell wrote:
> On Wed, 6 Nov 2024 at 17:21, Pierrick Bouvier
> <pierrick.bouvier@linaro.org> wrote:
>> I noticed by --enable-debug in configure is a combination of enabling
>> checks (enable-debug-tcg + graph + mutex), and deactivating optimizations.
>>
>> Would it be worth keeping the optimizations and runtime checks instead?
>> This way, there would be no more "timeout" issue.
>>
>> I'm not sure which added value we get from O0, except for debugging
>> locally QEMU.
> 
> "Debugging locally QEMU" is exactly what --enable-debug is intended for...
> 

Yes...
but it seems like we also use it to mean "enable debug checks" in CI, and
that impacts runtime because optimizations are deactivated. I don't think
I'm the only one who has been confused by this.

So my point is that we should maybe differentiate the two use cases at 
configure level.

--enable-debug and
--enable-runtime-checks (or something more explicit)

> thanks
> -- PMM

Re: [PATCH] tests/functional: Bump timeouts of functional tests

Posted by Thomas Huth 1 year, 3 months ago

On 06/11/2024 18.30, Pierrick Bouvier wrote:
> On 11/6/24 09:26, Peter Maydell wrote:
>> On Wed, 6 Nov 2024 at 17:21, Pierrick Bouvier
>> <pierrick.bouvier@linaro.org> wrote:
>>> I noticed by --enable-debug in configure is a combination of enabling
>>> checks (enable-debug-tcg + graph + mutex), and deactivating optimizations.
>>>
>>> Would it be worth keeping the optimizations and runtime checks instead?
>>> This way, there would be no more "timeout" issue.
>>>
>>> I'm not sure which added value we get from O0, except for debugging
>>> locally QEMU.
>>
>> "Debugging locally QEMU" is exactly what --enable-debug is intended for...
>>
> 
> Yes...
> but it seems like we take it for "enable debug checks" in CI as well and it 
> impacts runtime, because optimizations are deactivated. I think I've not 
> been the only one confused about this.
> 
> So my point is that we should maybe differentiate the two use cases at 
> configure level.
> 
> --enable-debug and
> --enable-runtime-checks (or something more explicit)

Would that really help? I guess people still want to be able to run "make 
check" when they compiled with --enable-debug, so we still need to be 
prepared to run the checks with a slow QEMU.

But I wonder whether we could maybe use -Og instead of -O0 nowadays?

  Thomas

Re: [PATCH] tests/functional: Bump timeouts of functional tests

Posted by Pierrick Bouvier 1 year, 3 months ago

On 11/6/24 09:40, Thomas Huth wrote:
> On 06/11/2024 18.30, Pierrick Bouvier wrote:
>> On 11/6/24 09:26, Peter Maydell wrote:
>>> On Wed, 6 Nov 2024 at 17:21, Pierrick Bouvier
>>> <pierrick.bouvier@linaro.org> wrote:
>>>> I noticed by --enable-debug in configure is a combination of enabling
>>>> checks (enable-debug-tcg + graph + mutex), and deactivating optimizations.
>>>>
>>>> Would it be worth keeping the optimizations and runtime checks instead?
>>>> This way, there would be no more "timeout" issue.
>>>>
>>>> I'm not sure which added value we get from O0, except for debugging
>>>> locally QEMU.
>>>
>>> "Debugging locally QEMU" is exactly what --enable-debug is intended for...
>>>
>>
>> Yes...
>> but it seems like we take it for "enable debug checks" in CI as well and it
>> impacts runtime, because optimizations are deactivated. I think I've not
>> been the only one confused about this.
>>
>> So my point is that we should maybe differentiate the two use cases at
>> configure level.
>>
>> --enable-debug and
>> --enable-runtime-checks (or something more explicit)
> 
> Would that really help? I guess people still want to be able to run "make
> check" when they compiled with --enable-debug, so we still need to be
> prepared to run the checks with a slow QEMU.
> 

Makes sense, even though it seems to indicate that our default
semantics are wrong here.

> But I wonder whether we could maybe use -Og instead of -O0 nowadays?
> 

It would not hurt, but I'm not sure it's enough to avoid hitting those 
timeout/perf difference issues.

>    Thomas
>