Increase timeout for tests and syntax-check

[PATCH] Increase timeout for tests and syntax-check

Posted by Michal Privoznik 3 years, 2 months ago

Since we've switched to meson our tests run with a timeout (meson
uses 30 seconds as the default). However, not every machine that
builds libvirt is fast enough to run every test under 30 seconds
(each test binary has its own timeout, but still). For instance
when building a package for distro on a farm that's under load.
Or on a generally slow ARM hardware. While each developer can
tune their command line for building by adding
--timeout-multiplier=10, this is hard to do for aforementioned
build farms.

It's time to admit that not everybody has the latest, top shelf
CPU and increase the timeout.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---

Successfully tested on my RasPi 2B, where the top three longest test
were:

  qemuxml2argvtest              OK  142.43s
  virschematest                 OK  40.82s
  virnettlssessiontest          OK  38.64s

and syntax-check:

 sc_spacing-check               OK  50.26
 sc_avoid_if_before_free        OK  42.30s
 sc_prohibit_cross_inclusion    OK  40.51s

 build-aux/meson.build |  1 +
 tests/meson.build     | 14 ++++++--------
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/build-aux/meson.build b/build-aux/meson.build
index c506feefd2..e7b2db8759 100644
--- a/build-aux/meson.build
+++ b/build-aux/meson.build
@@ -44,6 +44,7 @@ if git
         potfiles_dep,
       ],
       suite: 'syntax-check',
+      timeout: 120,
     )
   endforeach
 endif
diff --git a/tests/meson.build b/tests/meson.build
index 0de0783839..702090d594 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -546,6 +546,10 @@ if conf.has('WITH_YAJL')
   ]
 endif
 
+# Increase the default timeout because some tests may run longer,
+# esp. on slower systems.
+timeout = 180
+
 foreach data : tests
   test_sources = '@0@.c'.format(data['name'])
   test_bin = executable(
@@ -577,12 +581,6 @@ foreach data : tests
     ],
     export_dynamic: true,
   )
-  if data['name'] == 'qemuxml2argvtest'
-    timeout = 90
-  else
-    # default meson timeout
-    timeout = 30
-  endif
   test(data['name'], test_bin, env: tests_env, timeout: timeout)
 endforeach
 
@@ -683,7 +681,7 @@ endif
 
 foreach name : test_scripts
   script = find_program(name)
-  test(name, script, env: tests_env)
+  test(name, script, env: tests_env, timeout: timeout)
 endforeach
 
 add_test_setup(
@@ -703,6 +701,6 @@ add_test_setup(
     '--suppressions=@0@'.format(meson.current_source_dir() / '.valgrind.supp'),
     '--error-exitcode=1',
   ],
-  # default timeout in meson is 30s
+  # Tests take a lot longer when run under Valgrind
   timeout_multiplier: 4,
 )
-- 
2.26.2

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Daniel Henrique Barboza 3 years, 2 months ago

On 1/27/21 2:59 PM, Michal Privoznik wrote:
> Since we've switched to meson our tests run with a timeout (meson
> uses 30 seconds as the default). However, not every machine that
> builds libvirt is fast enough to run every test under 30 seconds
> (each test binary has its own timeout, but still). For instance
> when building a package for distro on a farm that's under load.
> Or on a generally slow ARM hardware. While each developer can
> tune their command line for building by adding
> --timeout-multiplier=10, this is hard to do for aforementioned
> build farms.
> 
> It's time to admit that not everybody has the latest, top shelf
> CPU and increase the timeout.
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---

This sure will help these build farms environments, but what about the cases
where an actual timeout means that there is something wrong with the code?
E.g. commits 46d88d8dba56 and 2ba0b7497ce7 were only possible because I was
seeing tests timing out in Power hosts when the 30 sec timeout was being
enforced.

A 120 second default timeout for the majority of the test cases is a long time.
virschematest in this laptop I use takes 2.5 sec to complete. If I do something
wrong in the code and now the test is now 4 times slower (10 sec) I will not be
able to detect it (I'll need to start keeping track or something). You'll have to
run the test suit on your RasPi 2B to see that something went wrong because the
timeout is better tuned to your RasPI than this laptop, but then the code is already
upstream.

And the tests will get more complex and will naturally take longer to complete.
Eventually this timeout might no be enough. Increase the timeout again?

Meson 0.57 has support for disabling test timeout entirely with --timeout-multiplier=0,
disabling test timeout entirely. Can't we bump the meson version to 0.57 and
then tell the distros to use timeout-multiplier=0? That will solve the problems
for them, I keep the short timeouts for development, and we won't need to keep
bumping the test timeout once every 2 years or something because the s390x
TCG enviroment in Fedora COPR is timing out again.

Thanks,

DHB

> 
> Successfully tested on my RasPi 2B, where the top three longest test
> were:
> 
>    qemuxml2argvtest              OK  142.43s
>    virschematest                 OK  40.82s
>    virnettlssessiontest          OK  38.64s
> 
> and syntax-check:
> 
>   sc_spacing-check               OK  50.26
>   sc_avoid_if_before_free        OK  42.30s
>   sc_prohibit_cross_inclusion    OK  40.51s
> 
>   build-aux/meson.build |  1 +
>   tests/meson.build     | 14 ++++++--------
>   2 files changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/build-aux/meson.build b/build-aux/meson.build
> index c506feefd2..e7b2db8759 100644
> --- a/build-aux/meson.build
> +++ b/build-aux/meson.build
> @@ -44,6 +44,7 @@ if git
>           potfiles_dep,
>         ],
>         suite: 'syntax-check',
> +      timeout: 120,
>       )
>     endforeach
>   endif
> diff --git a/tests/meson.build b/tests/meson.build
> index 0de0783839..702090d594 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -546,6 +546,10 @@ if conf.has('WITH_YAJL')
>     ]
>   endif
>   
> +# Increase the default timeout because some tests may run longer,
> +# esp. on slower systems.
> +timeout = 180
> +
>   foreach data : tests
>     test_sources = '@0@.c'.format(data['name'])
>     test_bin = executable(
> @@ -577,12 +581,6 @@ foreach data : tests
>       ],
>       export_dynamic: true,
>     )
> -  if data['name'] == 'qemuxml2argvtest'
> -    timeout = 90
> -  else
> -    # default meson timeout
> -    timeout = 30
> -  endif
>     test(data['name'], test_bin, env: tests_env, timeout: timeout)
>   endforeach
>   
> @@ -683,7 +681,7 @@ endif
>   
>   foreach name : test_scripts
>     script = find_program(name)
> -  test(name, script, env: tests_env)
> +  test(name, script, env: tests_env, timeout: timeout)
>   endforeach
>   
>   add_test_setup(
> @@ -703,6 +701,6 @@ add_test_setup(
>       '--suppressions=@0@'.format(meson.current_source_dir() / '.valgrind.supp'),
>       '--error-exitcode=1',
>     ],
> -  # default timeout in meson is 30s
> +  # Tests take a lot longer when run under Valgrind
>     timeout_multiplier: 4,
>   )
>

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Pavel Hrdina 3 years, 2 months ago

On Fri, Jan 29, 2021 at 09:30:24AM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 1/27/21 2:59 PM, Michal Privoznik wrote:
> > Since we've switched to meson our tests run with a timeout (meson
> > uses 30 seconds as the default). However, not every machine that
> > builds libvirt is fast enough to run every test under 30 seconds
> > (each test binary has its own timeout, but still). For instance
> > when building a package for distro on a farm that's under load.
> > Or on a generally slow ARM hardware. While each developer can
> > tune their command line for building by adding
> > --timeout-multiplier=10, this is hard to do for aforementioned
> > build farms.
> > 
> > It's time to admit that not everybody has the latest, top shelf
> > CPU and increase the timeout.
> > 
> > Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> > ---
> 
> This sure will help these build farms environments, but what about the cases
> where an actual timeout means that there is something wrong with the code?
> E.g. commits 46d88d8dba56 and 2ba0b7497ce7 were only possible because I was
> seeing tests timing out in Power hosts when the 30 sec timeout was being
> enforced.
> 
> A 120 second default timeout for the majority of the test cases is a long time.
> virschematest in this laptop I use takes 2.5 sec to complete. If I do something
> wrong in the code and now the test is now 4 times slower (10 sec) I will not be
> able to detect it (I'll need to start keeping track or something). You'll have to
> run the test suit on your RasPi 2B to see that something went wrong because the
> timeout is better tuned to your RasPI than this laptop, but then the code is already
> upstream.
> 
> And the tests will get more complex and will naturally take longer to complete.
> Eventually this timeout might no be enough. Increase the timeout again?
> 
> Meson 0.57 has support for disabling test timeout entirely with --timeout-multiplier=0,
> disabling test timeout entirely. Can't we bump the meson version to 0.57 and
> then tell the distros to use timeout-multiplier=0? That will solve the problems
> for them, I keep the short timeouts for development, and we won't need to keep
> bumping the test timeout once every 2 years or something because the s390x
> TCG enviroment in Fedora COPR is timing out again.

Agreed. The timeout is useful and we can always find a machine where the
current timeout will not be good enough. I don't think it is a good idea
to increase the timeout like this for only few specific use-cases.

Bumping meson version to 0.57 just only for this tiny nice to have
feature is not something we should do. It would make building libvirt
unnecessary complicated for most contributors not using latest
bleeding-edge software. For example meson 0.57 is not even in
Fedora Rawhide, Gentoo, Archlinux and possibly on most of the similar
distributions.

Pavel

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Michal Privoznik 3 years, 2 months ago

On 1/29/21 1:45 PM, Pavel Hrdina wrote:
> On Fri, Jan 29, 2021 at 09:30:24AM -0300, Daniel Henrique Barboza wrote:
>>
>>
>> On 1/27/21 2:59 PM, Michal Privoznik wrote:
>>> Since we've switched to meson our tests run with a timeout (meson
>>> uses 30 seconds as the default). However, not every machine that
>>> builds libvirt is fast enough to run every test under 30 seconds
>>> (each test binary has its own timeout, but still). For instance
>>> when building a package for distro on a farm that's under load.
>>> Or on a generally slow ARM hardware. While each developer can
>>> tune their command line for building by adding
>>> --timeout-multiplier=10, this is hard to do for aforementioned
>>> build farms.
>>>
>>> It's time to admit that not everybody has the latest, top shelf
>>> CPU and increase the timeout.
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>>> ---
>>
>> This sure will help these build farms environments, but what about the cases
>> where an actual timeout means that there is something wrong with the code?
>> E.g. commits 46d88d8dba56 and 2ba0b7497ce7 were only possible because I was
>> seeing tests timing out in Power hosts when the 30 sec timeout was being
>> enforced.
>>
>> A 120 second default timeout for the majority of the test cases is a long time.
>> virschematest in this laptop I use takes 2.5 sec to complete. If I do something
>> wrong in the code and now the test is now 4 times slower (10 sec) I will not be
>> able to detect it (I'll need to start keeping track or something). You'll have to
>> run the test suit on your RasPi 2B to see that something went wrong because the
>> timeout is better tuned to your RasPI than this laptop, but then the code is already
>> upstream.
>>
>> And the tests will get more complex and will naturally take longer to complete.
>> Eventually this timeout might no be enough. Increase the timeout again?
>>
>> Meson 0.57 has support for disabling test timeout entirely with --timeout-multiplier=0,
>> disabling test timeout entirely. Can't we bump the meson version to 0.57 and
>> then tell the distros to use timeout-multiplier=0? That will solve the problems
>> for them, I keep the short timeouts for development, and we won't need to keep
>> bumping the test timeout once every 2 years or something because the s390x
>> TCG enviroment in Fedora COPR is timing out again.
> 
> Agreed. The timeout is useful and we can always find a machine where the
> current timeout will not be good enough. I don't think it is a good idea
> to increase the timeout like this for only few specific use-cases.

So we agree that timeouts are evil :-) Back in the autotools days we did 
not use any timeout at all and I don't remember that causing any 
trouble. But maybe it was so painful that I blocked it in my mind :-)

Michal

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Pavel Hrdina 3 years, 2 months ago

On Fri, Jan 29, 2021 at 02:07:31PM +0100, Michal Privoznik wrote:
> On 1/29/21 1:45 PM, Pavel Hrdina wrote:
> > On Fri, Jan 29, 2021 at 09:30:24AM -0300, Daniel Henrique Barboza wrote:
> > > 
> > > 
> > > On 1/27/21 2:59 PM, Michal Privoznik wrote:
> > > > Since we've switched to meson our tests run with a timeout (meson
> > > > uses 30 seconds as the default). However, not every machine that
> > > > builds libvirt is fast enough to run every test under 30 seconds
> > > > (each test binary has its own timeout, but still). For instance
> > > > when building a package for distro on a farm that's under load.
> > > > Or on a generally slow ARM hardware. While each developer can
> > > > tune their command line for building by adding
> > > > --timeout-multiplier=10, this is hard to do for aforementioned
> > > > build farms.
> > > > 
> > > > It's time to admit that not everybody has the latest, top shelf
> > > > CPU and increase the timeout.
> > > > 
> > > > Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> > > > ---
> > > 
> > > This sure will help these build farms environments, but what about the cases
> > > where an actual timeout means that there is something wrong with the code?
> > > E.g. commits 46d88d8dba56 and 2ba0b7497ce7 were only possible because I was
> > > seeing tests timing out in Power hosts when the 30 sec timeout was being
> > > enforced.
> > > 
> > > A 120 second default timeout for the majority of the test cases is a long time.
> > > virschematest in this laptop I use takes 2.5 sec to complete. If I do something
> > > wrong in the code and now the test is now 4 times slower (10 sec) I will not be
> > > able to detect it (I'll need to start keeping track or something). You'll have to
> > > run the test suit on your RasPi 2B to see that something went wrong because the
> > > timeout is better tuned to your RasPI than this laptop, but then the code is already
> > > upstream.
> > > 
> > > And the tests will get more complex and will naturally take longer to complete.
> > > Eventually this timeout might no be enough. Increase the timeout again?
> > > 
> > > Meson 0.57 has support for disabling test timeout entirely with --timeout-multiplier=0,
> > > disabling test timeout entirely. Can't we bump the meson version to 0.57 and
> > > then tell the distros to use timeout-multiplier=0? That will solve the problems
> > > for them, I keep the short timeouts for development, and we won't need to keep
> > > bumping the test timeout once every 2 years or something because the s390x
> > > TCG enviroment in Fedora COPR is timing out again.
> > 
> > Agreed. The timeout is useful and we can always find a machine where the
> > current timeout will not be good enough. I don't think it is a good idea
> > to increase the timeout like this for only few specific use-cases.
> 
> So we agree that timeouts are evil :-) Back in the autotools days we did not

No we don't :) timeouts are useful in test suites to reveal issues that
something takes a long time when it shouldn't.

Pavel

> use any timeout at all and I don't remember that causing any trouble. But
> maybe it was so painful that I blocked it in my mind :-)
> 
> Michal

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Michal Privoznik 3 years, 2 months ago

On 1/29/21 2:25 PM, Pavel Hrdina wrote:
> On Fri, Jan 29, 2021 at 02:07:31PM +0100, Michal Privoznik wrote:
>> On 1/29/21 1:45 PM, Pavel Hrdina wrote:
>>> On Fri, Jan 29, 2021 at 09:30:24AM -0300, Daniel Henrique Barboza wrote:
>>>>

>> So we agree that timeouts are evil :-) Back in the autotools days we did not
> 
> No we don't :) timeouts are useful in test suites to reveal issues that
> something takes a long time when it shouldn't.
I view timeout as a prevention mechanism for a test that went into an 
endless loop. If we really care about timing then there are way better 
techniques than meson's test timeout.

Or put differently, I don't record times for individual test cases and I 
would not notice if a test doubles its run time.

Michal

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Daniel P. Berrangé 3 years, 2 months ago

On Fri, Jan 29, 2021 at 09:30:24AM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 1/27/21 2:59 PM, Michal Privoznik wrote:
> > Since we've switched to meson our tests run with a timeout (meson
> > uses 30 seconds as the default). However, not every machine that
> > builds libvirt is fast enough to run every test under 30 seconds
> > (each test binary has its own timeout, but still). For instance
> > when building a package for distro on a farm that's under load.
> > Or on a generally slow ARM hardware. While each developer can
> > tune their command line for building by adding
> > --timeout-multiplier=10, this is hard to do for aforementioned
> > build farms.
> > 
> > It's time to admit that not everybody has the latest, top shelf
> > CPU and increase the timeout.
> > 
> > Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> > ---
> 
> This sure will help these build farms environments, but what about the cases
> where an actual timeout means that there is something wrong with the code?
> E.g. commits 46d88d8dba56 and 2ba0b7497ce7 were only possible because I was
> seeing tests timing out in Power hosts when the 30 sec timeout was being
> enforced.
> 
> A 120 second default timeout for the majority of the test cases is a long time.
> virschematest in this laptop I use takes 2.5 sec to complete. If I do something
> wrong in the code and now the test is now 4 times slower (10 sec) I will not be
> able to detect it (I'll need to start keeping track or something). You'll have to
> run the test suit on your RasPi 2B to see that something went wrong because the
> timeout is better tuned to your RasPI than this laptop, but then the code is already
> upstream.
> 
> And the tests will get more complex and will naturally take longer to complete.
> Eventually this timeout might no be enough. Increase the timeout again?
> 
> Meson 0.57 has support for disabling test timeout entirely with --timeout-multiplier=0,
> disabling test timeout entirely. Can't we bump the meson version to 0.57 and
> then tell the distros to use timeout-multiplier=0? That will solve the problems
> for them, I keep the short timeouts for development, and we won't need to keep
> bumping the test timeout once every 2 years or something because the s390x
> TCG enviroment in Fedora COPR is timing out again.

We don't even need to bump the meson versionin libvirt. If someone ones the
functionality from 0.57, then can simply install that version of meson.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Michal Privoznik 3 years, 2 months ago

On 1/29/21 1:30 PM, Daniel Henrique Barboza wrote:
> 
> 
> On 1/27/21 2:59 PM, Michal Privoznik wrote:
>> Since we've switched to meson our tests run with a timeout (meson
>> uses 30 seconds as the default). However, not every machine that
>> builds libvirt is fast enough to run every test under 30 seconds
>> (each test binary has its own timeout, but still). For instance
>> when building a package for distro on a farm that's under load.
>> Or on a generally slow ARM hardware. While each developer can
>> tune their command line for building by adding
>> --timeout-multiplier=10, this is hard to do for aforementioned
>> build farms.
>>
>> It's time to admit that not everybody has the latest, top shelf
>> CPU and increase the timeout.
>>
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
> 
> This sure will help these build farms environments, but what about the 
> cases
> where an actual timeout means that there is something wrong with the code?
> E.g. commits 46d88d8dba56 and 2ba0b7497ce7 were only possible because I was
> seeing tests timing out in Power hosts when the 30 sec timeout was being
> enforced.
> 
> A 120 second default timeout for the majority of the test cases is a 
> long time.
> virschematest in this laptop I use takes 2.5 sec to complete. If I do 
> something
> wrong in the code and now the test is now 4 times slower (10 sec) I will 
> not be
> able to detect it (I'll need to start keeping track or something). 

With 30 second timeout you won't detect that either. Using timeout as an 
indicator of test failure is wrong IMO. And if I were not lazy and fixed 
'check-access' test suite then we would see instantly what tests are 
accessing paths in the host (=> depend on host configuration).

> You'll have to
> run the test suit on your RasPi 2B to see that something went wrong 
> because the
> timeout is better tuned to your RasPI than this laptop, but then the 
> code is already
> upstream.

So should we make timeouts shorter then? Why is 30 seconds sweet spot?

> 
> And the tests will get more complex and will naturally take longer to 
> complete.
> Eventually this timeout might no be enough. Increase the timeout again?

Sure, why not? We adapt to newer gcc/clang/$whatever, why not timeout?

Michal

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Daniel Henrique Barboza 3 years, 2 months ago

On 1/29/21 10:04 AM, Michal Privoznik wrote:
> On 1/29/21 1:30 PM, Daniel Henrique Barboza wrote:
>>
>>
>> On 1/27/21 2:59 PM, Michal Privoznik wrote:
>>> Since we've switched to meson our tests run with a timeout (meson
>>> uses 30 seconds as the default). However, not every machine that
>>> builds libvirt is fast enough to run every test under 30 seconds
>>> (each test binary has its own timeout, but still). For instance
>>> when building a package for distro on a farm that's under load.
>>> Or on a generally slow ARM hardware. While each developer can
>>> tune their command line for building by adding
>>> --timeout-multiplier=10, this is hard to do for aforementioned
>>> build farms.
>>>
>>> It's time to admit that not everybody has the latest, top shelf
>>> CPU and increase the timeout.
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>>> ---
>>
>> This sure will help these build farms environments, but what about the cases
>> where an actual timeout means that there is something wrong with the code?
>> E.g. commits 46d88d8dba56 and 2ba0b7497ce7 were only possible because I was
>> seeing tests timing out in Power hosts when the 30 sec timeout was being
>> enforced.
>>
>> A 120 second default timeout for the majority of the test cases is a long time.
>> virschematest in this laptop I use takes 2.5 sec to complete. If I do something
>> wrong in the code and now the test is now 4 times slower (10 sec) I will not be
>> able to detect it (I'll need to start keeping track or something). 
> 
> With 30 second timeout you won't detect that either. Using timeout as an indicator of test failure is wrong IMO. And if I were not lazy and fixed 'check-access' test suite then we would see instantly what tests are accessing paths in the host (=> depend on host configuration).

30 sec is too long for most tests :)

Let's put it this way: I don't think most of us keeps track of how much a
certain test takes to complete during our dev, and the 30 sec timeout is
a marker to see if we messed up or not. A 120 timeout is too extreme for
my dev env (and I believe most if not all of us can run the test suit without
any problems during development).

All that said, I just ran 'ninja -C build test' and verified that we don't provide
a total run time for all tests. We provide the time taken for all 306 tests, but
not a total. If we change the script to output the total time taken to run all
tests in the test suit, in a successful run (i.e. no failed tests), then I wouldn't
mind the timeout increase. I can run the test suit in master, get the total time
taken, do some coding, run again, compare the new total. This comparison will give
me a hint of whether something went too wrong, then I can compare the numbers myself
to see what happened. In this case I wouldn't mind removing all test timeouts or
increasing the timeout to help the distros or what have you.

Thanks,

DHB

> 
>> You'll have to
>> run the test suit on your RasPi 2B to see that something went wrong because the
>> timeout is better tuned to your RasPI than this laptop, but then the code is already
>> upstream.
> 
> So should we make timeouts shorter then? Why is 30 seconds sweet spot?
> 
>>
>> And the tests will get more complex and will naturally take longer to complete.
>> Eventually this timeout might no be enough. Increase the timeout again?
> 
> Sure, why not? We adapt to newer gcc/clang/$whatever, why not timeout?
> 
> Michal
>

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Daniel P. Berrangé 3 years, 2 months ago

On Wed, Jan 27, 2021 at 06:59:58PM +0100, Michal Privoznik wrote:
> Since we've switched to meson our tests run with a timeout (meson
> uses 30 seconds as the default). However, not every machine that
> builds libvirt is fast enough to run every test under 30 seconds
> (each test binary has its own timeout, but still). For instance
> when building a package for distro on a farm that's under load.
> Or on a generally slow ARM hardware. While each developer can
> tune their command line for building by adding
> --timeout-multiplier=10, this is hard to do for aforementioned
> build farms.

I don't get why it is hard for build farms. Someone, somwhere
is writing the script that invokes meson & ninja with some
args. Why is it hard to add --timeout-multiplier=10 too ?

> It's time to admit that not everybody has the latest, top shelf
> CPU and increase the timeout.

I'm not convinced we want to optimize for the slowest hardware
we can find, especially when there's an easy option of setting
--timeout-multiplier=10.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Andrea Bolognani 3 years, 2 months ago

On Fri, 2021-01-29 at 12:48 +0000, Daniel P. Berrangé wrote:
> On Wed, Jan 27, 2021 at 06:59:58PM +0100, Michal Privoznik wrote:
> > Since we've switched to meson our tests run with a timeout (meson
> > uses 30 seconds as the default). However, not every machine that
> > builds libvirt is fast enough to run every test under 30 seconds
> > (each test binary has its own timeout, but still). For instance
> > when building a package for distro on a farm that's under load.
> > Or on a generally slow ARM hardware. While each developer can
> > tune their command line for building by adding
> > --timeout-multiplier=10, this is hard to do for aforementioned
> > build farms.
> 
> I don't get why it is hard for build farms. Someone, somwhere
> is writing the script that invokes meson & ninja with some
> args. Why is it hard to add --timeout-multiplier=10 too ?
> 
> > It's time to admit that not everybody has the latest, top shelf
> > CPU and increase the timeout.
> 
> I'm not convinced we want to optimize for the slowest hardware
> we can find, especially when there's an easy option of setting
> --timeout-multiplier=10.

It's not complicated to add the option, but the fact that Debian,
SUSE and now Fedora all need to specify a timeout multiplier hints to
the fact that perhaps the default timeout is just too small.

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Daniel P. Berrangé 3 years, 2 months ago

On Fri, Jan 29, 2021 at 03:46:18PM +0100, Andrea Bolognani wrote:
> On Fri, 2021-01-29 at 12:48 +0000, Daniel P. Berrangé wrote:
> > On Wed, Jan 27, 2021 at 06:59:58PM +0100, Michal Privoznik wrote:
> > > Since we've switched to meson our tests run with a timeout (meson
> > > uses 30 seconds as the default). However, not every machine that
> > > builds libvirt is fast enough to run every test under 30 seconds
> > > (each test binary has its own timeout, but still). For instance
> > > when building a package for distro on a farm that's under load.
> > > Or on a generally slow ARM hardware. While each developer can
> > > tune their command line for building by adding
> > > --timeout-multiplier=10, this is hard to do for aforementioned
> > > build farms.
> > 
> > I don't get why it is hard for build farms. Someone, somwhere
> > is writing the script that invokes meson & ninja with some
> > args. Why is it hard to add --timeout-multiplier=10 too ?
> > 
> > > It's time to admit that not everybody has the latest, top shelf
> > > CPU and increase the timeout.
> > 
> > I'm not convinced we want to optimize for the slowest hardware
> > we can find, especially when there's an easy option of setting
> > --timeout-multiplier=10.
> 
> It's not complicated to add the option, but the fact that Debian,
> SUSE and now Fedora all need to specify a timeout multiplier hints to
> the fact that perhaps the default timeout is just too small.

AFAIK, Fedora hasn't set any timeout multiplier in our builds.

Some specific tests have an increased timeout, but that's reasonable
because they are very big tests, and that's a targetted increase.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Michal Privoznik 3 years, 2 months ago

On 1/29/21 3:53 PM, Daniel P. Berrangé wrote:
> On Fri, Jan 29, 2021 at 03:46:18PM +0100, Andrea Bolognani wrote:
>> On Fri, 2021-01-29 at 12:48 +0000, Daniel P. Berrangé wrote:
>>> On Wed, Jan 27, 2021 at 06:59:58PM +0100, Michal Privoznik wrote:
>>>> Since we've switched to meson our tests run with a timeout (meson
>>>> uses 30 seconds as the default). However, not every machine that
>>>> builds libvirt is fast enough to run every test under 30 seconds
>>>> (each test binary has its own timeout, but still). For instance
>>>> when building a package for distro on a farm that's under load.
>>>> Or on a generally slow ARM hardware. While each developer can
>>>> tune their command line for building by adding
>>>> --timeout-multiplier=10, this is hard to do for aforementioned
>>>> build farms.
>>>
>>> I don't get why it is hard for build farms. Someone, somwhere
>>> is writing the script that invokes meson & ninja with some
>>> args. Why is it hard to add --timeout-multiplier=10 too ?
>>>
>>>> It's time to admit that not everybody has the latest, top shelf
>>>> CPU and increase the timeout.
>>>
>>> I'm not convinced we want to optimize for the slowest hardware
>>> we can find, especially when there's an easy option of setting
>>> --timeout-multiplier=10.
>>
>> It's not complicated to add the option, but the fact that Debian,
>> SUSE and now Fedora all need to specify a timeout multiplier hints to
>> the fact that perhaps the default timeout is just too small.

It's an arbitrary number that was chosen outside of libvirt devel 
community and IIUC can change anytime meson devels please. So 
effectively the only control we have is this --timeout-multiplier which 
is still not guaranteed to yield correct results (e.g. if I now use 
multiplier of value 10 to allow 300s timeout but the default timeout is 
then changed to 20s, all of a sudfen I have 200s timeout).

> 
> AFAIK, Fedora hasn't set any timeout multiplier in our builds.

Only because Cole did not merge it, because I came up with idea for this 
patch. Here's Cole's original patch:

https://www.redhat.com/archives/libvir-list/2021-January/msg00919.html

But okay, looks like we don't have an agreement here so I won't push 
this patch. But Cole should still push his.

Michal

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Daniel P. Berrangé 3 years, 2 months ago

On Fri, Jan 29, 2021 at 04:14:30PM +0100, Michal Privoznik wrote:
> On 1/29/21 3:53 PM, Daniel P. Berrangé wrote:
> > On Fri, Jan 29, 2021 at 03:46:18PM +0100, Andrea Bolognani wrote:
> > > On Fri, 2021-01-29 at 12:48 +0000, Daniel P. Berrangé wrote:
> > > > On Wed, Jan 27, 2021 at 06:59:58PM +0100, Michal Privoznik wrote:
> > > > > Since we've switched to meson our tests run with a timeout (meson
> > > > > uses 30 seconds as the default). However, not every machine that
> > > > > builds libvirt is fast enough to run every test under 30 seconds
> > > > > (each test binary has its own timeout, but still). For instance
> > > > > when building a package for distro on a farm that's under load.
> > > > > Or on a generally slow ARM hardware. While each developer can
> > > > > tune their command line for building by adding
> > > > > --timeout-multiplier=10, this is hard to do for aforementioned
> > > > > build farms.
> > > > 
> > > > I don't get why it is hard for build farms. Someone, somwhere
> > > > is writing the script that invokes meson & ninja with some
> > > > args. Why is it hard to add --timeout-multiplier=10 too ?
> > > > 
> > > > > It's time to admit that not everybody has the latest, top shelf
> > > > > CPU and increase the timeout.
> > > > 
> > > > I'm not convinced we want to optimize for the slowest hardware
> > > > we can find, especially when there's an easy option of setting
> > > > --timeout-multiplier=10.
> > > 
> > > It's not complicated to add the option, but the fact that Debian,
> > > SUSE and now Fedora all need to specify a timeout multiplier hints to
> > > the fact that perhaps the default timeout is just too small.
> 
> It's an arbitrary number that was chosen outside of libvirt devel community
> and IIUC can change anytime meson devels please. So effectively the only
> control we have is this --timeout-multiplier which is still not guaranteed
> to yield correct results (e.g. if I now use multiplier of value 10 to allow
> 300s timeout but the default timeout is then changed to 20s, all of a sudfen
> I have 200s timeout).
> 
> > 
> > AFAIK, Fedora hasn't set any timeout multiplier in our builds.
> 
> Only because Cole did not merge it, because I came up with idea for this
> patch. Here's Cole's original patch:
> 
> https://www.redhat.com/archives/libvir-list/2021-January/msg00919.html
> 
> But okay, looks like we don't have an agreement here so I won't push this
> patch. But Cole should still push his.

Oh that's not a real Fedora build - that's using  COPR where s390 is
provided by QEMU TCG emulation. It is unsuprising that is slow as
treacle, and IMHO optimizing for that env is not desirable.

IOW, I'm not convinced we should merge Cole's either.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Andrea Bolognani 3 years, 2 months ago

On Fri, 2021-01-29 at 14:53 +0000, Daniel P. Berrangé wrote:
> On Fri, Jan 29, 2021 at 03:46:18PM +0100, Andrea Bolognani wrote:
> > On Fri, 2021-01-29 at 12:48 +0000, Daniel P. Berrangé wrote:
> > > I'm not convinced we want to optimize for the slowest hardware
> > > we can find, especially when there's an easy option of setting
> > > --timeout-multiplier=10.
> > 
> > It's not complicated to add the option, but the fact that Debian,
> > SUSE and now Fedora all need to specify a timeout multiplier hints to
> > the fact that perhaps the default timeout is just too small.
> 
> AFAIK, Fedora hasn't set any timeout multiplier in our builds.

With

  commit 70307548d1622bfeffd60c86e62098d165c95074
  Author: Cole Robinson <crobinso@redhat.com>
  Date:   Thu Jan 21 16:54:45 2021 -0500

    spec: Increase meson test timeout 10x

    Tests time out when building in slow environments, like emulated
    s390x in Fedora copr. Bump up the test timeout

    Reviewed-by: Neal Gompa <ngompa13@gmail.com>
    Reviewed-by: Andrea Bolognani <abologna@redhat.com>
    Signed-off-by: Cole Robinson <crobinso@redhat.com>

the timeout multiplier has been added to the upstream spec file,
which the Fedora spec file is based on, so while you're correct that
Fedora is not yet using it, it's fair to assume it will soon make its
way there for the benefit of e.g. the virt-preview COPR.

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Daniel P. Berrangé 3 years, 2 months ago

On Fri, Jan 29, 2021 at 04:20:53PM +0100, Andrea Bolognani wrote:
> On Fri, 2021-01-29 at 14:53 +0000, Daniel P. Berrangé wrote:
> > On Fri, Jan 29, 2021 at 03:46:18PM +0100, Andrea Bolognani wrote:
> > > On Fri, 2021-01-29 at 12:48 +0000, Daniel P. Berrangé wrote:
> > > > I'm not convinced we want to optimize for the slowest hardware
> > > > we can find, especially when there's an easy option of setting
> > > > --timeout-multiplier=10.
> > > 
> > > It's not complicated to add the option, but the fact that Debian,
> > > SUSE and now Fedora all need to specify a timeout multiplier hints to
> > > the fact that perhaps the default timeout is just too small.
> > 
> > AFAIK, Fedora hasn't set any timeout multiplier in our builds.
> 
> With
> 
>   commit 70307548d1622bfeffd60c86e62098d165c95074
>   Author: Cole Robinson <crobinso@redhat.com>
>   Date:   Thu Jan 21 16:54:45 2021 -0500
> 
>     spec: Increase meson test timeout 10x
> 
>     Tests time out when building in slow environments, like emulated
>     s390x in Fedora copr. Bump up the test timeout
> 
>     Reviewed-by: Neal Gompa <ngompa13@gmail.com>
>     Reviewed-by: Andrea Bolognani <abologna@redhat.com>
>     Signed-off-by: Cole Robinson <crobinso@redhat.com>
> 
> the timeout multiplier has been added to the upstream spec file,
> which the Fedora spec file is based on, so while you're correct that
> Fedora is not yet using it, it's fair to assume it will soon make its
> way there for the benefit of e.g. the virt-preview COPR.

I think this is the wrong solution. IMHO RPM should be making th
%meson_test macro include the timeout arg automatically when
running on a emulated environment. This would fix the problem for
all users of meson, without causing unecessarily long timeouts
for native builds.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Andrea Bolognani 3 years, 2 months ago

On Fri, 2021-01-29 at 15:28 +0000, Daniel P. Berrangé wrote:
> On Fri, Jan 29, 2021 at 04:20:53PM +0100, Andrea Bolognani wrote:
> > the timeout multiplier has been added to the upstream spec file,
> > which the Fedora spec file is based on, so while you're correct that
> > Fedora is not yet using it, it's fair to assume it will soon make its
> > way there for the benefit of e.g. the virt-preview COPR.
> 
> I think this is the wrong solution. IMHO RPM should be making th
> %meson_test macro include the timeout arg automatically when
> running on a emulated environment. This would fix the problem for
> all users of meson, without causing unecessarily long timeouts
> for native builds.

How likely is it that some broken test will take longer than 30
seconds and less than 5 minutes to run on your average developer's
laptop? My experience suggests tests either take way less than half a
minute to complete, or go on spinning forever.

So yeah, in the unlikely event that your changes have introduced an
infinite wait in one of the tests, it will take you a couple more
minutes to realize that; however, considering how fast the test suite
is under meson you'd probably get suspicious way before actually
hitting the timeout, and introducing such an issue is hopefully not a
very frequent occurrence anyway.

Note that emulated environments are not the only ones hitting this:
native builds on slow architectures failing because of the timeout
are the reason why I introduced the timeout multiplier in Debian, for
example.

Finally, not everyone uses RPMs, and even those who do might not want
to use "rpmbuild" as a substitute for "meson test" - especially not
on the kind of hardware where the default test timeout is a problem!

Raising the timeout at the meson level makes it possible for people
to just run "meson test" and be reasonably confident it will only
fail if there's an actual bug in libvirt.

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [PATCH] Increase timeout for tests and syntax-check

Posted by Daniel P. Berrangé 3 years, 2 months ago

On Fri, Jan 29, 2021 at 04:56:07PM +0100, Andrea Bolognani wrote:
> On Fri, 2021-01-29 at 15:28 +0000, Daniel P. Berrangé wrote:
> > On Fri, Jan 29, 2021 at 04:20:53PM +0100, Andrea Bolognani wrote:
> > > the timeout multiplier has been added to the upstream spec file,
> > > which the Fedora spec file is based on, so while you're correct that
> > > Fedora is not yet using it, it's fair to assume it will soon make its
> > > way there for the benefit of e.g. the virt-preview COPR.
> > 
> > I think this is the wrong solution. IMHO RPM should be making th
> > %meson_test macro include the timeout arg automatically when
> > running on a emulated environment. This would fix the problem for
> > all users of meson, without causing unecessarily long timeouts
> > for native builds.
> 
> How likely is it that some broken test will take longer than 30
> seconds and less than 5 minutes to run on your average developer's
> laptop? My experience suggests tests either take way less than half a
> minute to complete, or go on spinning forever.
> 
> So yeah, in the unlikely event that your changes have introduced an
> infinite wait in one of the tests, it will take you a couple more
> minutes to realize that; however, considering how fast the test suite
> is under meson you'd probably get suspicious way before actually
> hitting the timeout, and introducing such an issue is hopefully not a
> very frequent occurrence anyway.
> 
> Note that emulated environments are not the only ones hitting this:
> native builds on slow architectures failing because of the timeout
> are the reason why I introduced the timeout multiplier in Debian, for
> example.
> 
> Finally, not everyone uses RPMs, and even those who do might not want
> to use "rpmbuild" as a substitute for "meson test" - especially not
> on the kind of hardware where the default test timeout is a problem!
> 
> Raising the timeout at the meson level makes it possible for people
> to just run "meson test" and be reasonably confident it will only
> fail if there's an actual bug in libvirt.

If it is such a broad problem then the default should be raised in
meson itself rather than individual apps using meson.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|