Hi, It was a rainy weekend here, so I invested it in automating some of my MIPS tests. The BootLinuxSshTest is not global-warming friendly; it is not meant to run on a CI system, but rather on a workstation prior to posting a pull request. It can surely be improved, but it is a good starting point. Regards, Phil. Philippe Mathieu-Daudé (4): BootLinuxConsoleTest: Let extract_from_deb handle various compressions BootLinuxConsoleTest: Test nanoMIPS kernels on the I7200 CPU BootLinuxConsoleTest: Run kerneltests BusyBox on Malta BootLinuxSshTest: Test some userspace commands on Malta MAINTAINERS | 1 + tests/acceptance/boot_linux_console.py | 112 ++++++++++- tests/acceptance/linux_ssh_mips_malta.py | 229 +++++++++++++++++++++++ tests/requirements.txt | 1 + 4 files changed, 341 insertions(+), 2 deletions(-) create mode 100644 tests/acceptance/linux_ssh_mips_malta.py -- 2.19.1
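For context on patch 1/4: a .deb ships its payload as data.tar.gz, data.tar.xz or data.tar.bz2 depending on how it was built, so a deb-extraction helper has to cope with any of those compressions. Below is a minimal, self-contained Python sketch of that idea only; it is not the code from the patch, and the helper name, the use of binutils `ar` and tarfile's compression auto-detection are assumptions made for illustration.

import os
import subprocess
import tarfile

def extract_from_deb(deb_path, dest_dir):
    # List the .deb members and find the payload archive, whatever its
    # compression suffix (data.tar.gz, data.tar.xz, data.tar.bz2, ...).
    deb_path = os.path.abspath(deb_path)
    members = subprocess.check_output(['ar', 't', deb_path]).decode().split()
    data_member = next(m for m in members if m.startswith('data.tar'))
    # Extract the payload archive into dest_dir...
    subprocess.check_call(['ar', 'x', deb_path, data_member], cwd=dest_dir)
    # ...and let tarfile's 'r:*' mode auto-detect the compression.
    with tarfile.open(os.path.join(dest_dir, data_member), 'r:*') as tar:
        tar.extractall(path=dest_dir)

From a test this would be called with something like extract_from_deb(kernel_deb, self.workdir), self.workdir being the per-test scratch directory Avocado provides.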
On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > Hi, > > It was a rainy week-end here, so I invested it to automatize some > of my MIPS tests. > > The BootLinuxSshTest is not Global warming friendly, it is not > meant to run on a CI system but rather on a workstation previous > to post a pull request. > It can surely be improved, but it is a good starting point. Until we actually have a mechanism to exclude the test case on travis-ci, I will remove patch 4/4 from the queue. Aleksandar, please don't merge patch 4/4 yet or it will break travis-ci. Cleber, Wainer, is it already possible to make "avocado run" skip tests tagged with "slow"? -- Eduardo
----- Original Message ----- > From: "Eduardo Habkost" <ehabkost@redhat.com> > To: "Philippe Mathieu-Daudé" <f4bug@amsat.org> > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Aleksandar Markovic" > <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" <amarkovic@wavecomp.com>, "Cleber Rosa" <crosa@redhat.com>, > "Aurelien Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" <wainersm@redhat.com> > Sent: Wednesday, May 22, 2019 5:12:30 PM > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > Hi, > > > > It was a rainy week-end here, so I invested it to automatize some > > of my MIPS tests. > > > > The BootLinuxSshTest is not Global warming friendly, it is not > > meant to run on a CI system but rather on a workstation previous > > to post a pull request. > > It can surely be improved, but it is a good starting point. > > Until we actually have a mechanism to exclude the test case on > travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > please don't merge patch 4/4 yet or it will break travis-ci. > > Cleber, Wainer, is it already possible to make "avocado run" skip > tests tagged with "slow"? > The mechanism exists, but we haven't tagged any test so far as slow. Should we define/document a criteria for a test to be slow? Given that this is highly subjective, we have to think of: * Will we consider the average or maximum run time (the timeout definition)? * For a single test, what is "slow"? Some rough numbers from Travis CI[1] to help us with guidelines: - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) - linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: PASS (18.14 s) - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) * Do we want to set a maximum job timeout? This way we can skip tests after a given amount of time has passed. Currently we interrupt the test running when the job timeout is reached, but it's possible to add a option so that no new tests will be started, but currently running ones will be waited on. Regards, - Cleber. [1] - https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518 > -- > Eduardo >
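For readers following the thread: the mechanism Cleber refers to is Avocado's docstring tags, i.e. a test marks itself inside its own source file and the runner filters on those marks. A minimal sketch of how the proposed "slow" tag could look; the tag name is only a proposal at this point, the avocado_qemu.Test base class mirrors the existing acceptance tests, and the exact CLI flags should be checked against `avocado run --help`.

from avocado_qemu import Test

class BootLinuxSshTest(Test):
    """
    Boots a Malta guest and exercises userspace commands over SSH.

    :avocado: tags=slow
    """
    def test_mips_malta_ssh(self):
        # ...launch QEMU, wait for the guest, run commands over SSH...
        pass

# A CI system could then drop the tagged tests with a negated filter,
# while still running tests that carry no tags at all:
#   avocado run --filter-by-tags=-slow --filter-by-tags-include-empty tests/acceptance/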
On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > > > ----- Original Message ----- > > From: "Eduardo Habkost" <ehabkost@redhat.com> > > To: "Philippe Mathieu-Daudé" <f4bug@amsat.org> > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Aleksandar Markovic" > > <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" < amarkovic@wavecomp.com>, "Cleber Rosa" <crosa@redhat.com>, > > "Aurelien Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" < wainersm@redhat.com> > > Sent: Wednesday, May 22, 2019 5:12:30 PM > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > > Hi, > > > > > > It was a rainy week-end here, so I invested it to automatize some > > > of my MIPS tests. > > > > > > The BootLinuxSshTest is not Global warming friendly, it is not > > > meant to run on a CI system but rather on a workstation previous > > > to post a pull request. > > > It can surely be improved, but it is a good starting point. > > > > Until we actually have a mechanism to exclude the test case on > > travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > please don't merge patch 4/4 yet or it will break travis-ci. > > > > Cleber, Wainer, is it already possible to make "avocado run" skip > > tests tagged with "slow"? > > > > The mechanism exists, but we haven't tagged any test so far as slow. > > Should we define/document a criteria for a test to be slow? Given > that this is highly subjective, we have to think of: > > * Will we consider the average or maximum run time (the timeout > definition)? > > * For a single test, what is "slow"? Some rough numbers from Travis > CI[1] to help us with guidelines: > - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > - linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: PASS (18.14 s) > - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > * Do we want to set a maximum job timeout? This way we can skip > tests after a given amount of time has passed. Currently we interrupt > the test running when the job timeout is reached, but it's possible > to add a option so that no new tests will be started, but currently > running ones will be waited on. > As far as integrating the patch into my queue, I did it just an hour or so prior the objections of others, but will inform Peter to put the pull request on hold, so it will not go to the main tree. We in Wave Computing (MIPS) are very happy with this test, even in its current state, and understand it as an initial version that will be subject to improvement and expansion. We consider this test seriously and think it is vital for QEMU for MIPS. We would like to put it in the “slow” group for the simple reason that, we gather, this would give us more freedom in future versions. Yours, Aleksandar > Regards, > - Cleber. > > [1] - https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518 > > > -- > > Eduardo > >
On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > > > ----- Original Message ----- > > From: "Eduardo Habkost" <ehabkost@redhat.com> > > To: "Philippe Mathieu-Daudé" <f4bug@amsat.org> > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Aleksandar Markovic" > > <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" < amarkovic@wavecomp.com>, "Cleber Rosa" <crosa@redhat.com>, > > "Aurelien Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" < wainersm@redhat.com> > > Sent: Wednesday, May 22, 2019 5:12:30 PM > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > > Hi, > > > > > > It was a rainy week-end here, so I invested it to automatize some > > > of my MIPS tests. > > > > > > The BootLinuxSshTest is not Global warming friendly, it is not > > > meant to run on a CI system but rather on a workstation previous > > > to post a pull request. > > > It can surely be improved, but it is a good starting point. > > > > Until we actually have a mechanism to exclude the test case on > > travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > please don't merge patch 4/4 yet or it will break travis-ci. > > > > Cleber, Wainer, is it already possible to make "avocado run" skip > > tests tagged with "slow"? > > > > The mechanism exists, but we haven't tagged any test so far as slow. > Cleber, For the test from patch 4/4, there is no dilemma - it should be in the “slow” group, as Philippe envisioned and said, so that it is not humpered with stricter requirements for “fast” (default) group. Could you explain us how to do it, so that we can hopefully finally proceed? Gratefully, Aleksandar > Should we define/document a criteria for a test to be slow? Given > that this is highly subjective, we have to think of: > > * Will we consider the average or maximum run time (the timeout > definition)? > > * For a single test, what is "slow"? Some rough numbers from Travis > CI[1] to help us with guidelines: > - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > - linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: PASS (18.14 s) > - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > * Do we want to set a maximum job timeout? This way we can skip > tests after a given amount of time has passed. Currently we interrupt > the test running when the job timeout is reached, but it's possible > to add a option so that no new tests will be started, but currently > running ones will be waited on. > > Regards, > - Cleber. > > [1] - https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518 > > > -- > > Eduardo > >
----- Original Message ----- > From: "Aleksandar Markovic" <aleksandar.m.mail@gmail.com> > To: "Cleber Rosa" <crosa@redhat.com> > Cc: "Wainer dos Santos Moschetta" <wainersm@redhat.com>, "Aleksandar Markovic" <amarkovic@wavecomp.com>, > qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Eduardo Habkost" <ehabkost@redhat.com>, > "Aurelien Jarno" <aurelien@aurel32.net>, "Philippe Mathieu-Daudé" <f4bug@amsat.org> > Sent: Wednesday, May 22, 2019 6:43:54 PM > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > > > > > > > ----- Original Message ----- > > > From: "Eduardo Habkost" <ehabkost@redhat.com> > > > To: "Philippe Mathieu-Daudé" <f4bug@amsat.org> > > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, > "Aleksandar Markovic" > > > <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" < > amarkovic@wavecomp.com>, "Cleber Rosa" <crosa@redhat.com>, > > > "Aurelien Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" < > wainersm@redhat.com> > > > Sent: Wednesday, May 22, 2019 5:12:30 PM > > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > > > Hi, > > > > > > > > It was a rainy week-end here, so I invested it to automatize some > > > > of my MIPS tests. > > > > > > > > The BootLinuxSshTest is not Global warming friendly, it is not > > > > meant to run on a CI system but rather on a workstation previous > > > > to post a pull request. > > > > It can surely be improved, but it is a good starting point. > > > > > > Until we actually have a mechanism to exclude the test case on > > > travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > > please don't merge patch 4/4 yet or it will break travis-ci. > > > > > > Cleber, Wainer, is it already possible to make "avocado run" skip > > > tests tagged with "slow"? > > > > > > > The mechanism exists, but we haven't tagged any test so far as slow. > > > > Cleber, > > For the test from patch 4/4, there is no dilemma - it should be in the > “slow” group, as Philippe envisioned and said, so that it is not humpered > with stricter requirements for “fast” (default) group. Could you explain us > how to do it, so that we can hopefully finally proceed? > Hi Aleksandar, The point is that there's no "group" definition at this point. This is the core of the discussion. I think we're close to converging to something simple and effective. Please let us know what you think of the proposals given. Thanks! - Cleber. > Gratefully, > Aleksandar > > > Should we define/document a criteria for a test to be slow? Given > > that this is highly subjective, we have to think of: > > > > * Will we consider the average or maximum run time (the timeout > > definition)? > > > > * For a single test, what is "slow"? Some rough numbers from Travis > > CI[1] to help us with guidelines: > > - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > > - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > > - > linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: > PASS (18.14 s) > > - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > > > * Do we want to set a maximum job timeout? This way we can skip > > tests after a given amount of time has passed. 
Currently we interrupt > > the test running when the job timeout is reached, but it's possible > > to add a option so that no new tests will be started, but currently > > running ones will be waited on. > > > > Regards, > > - Cleber. > > > > [1] - https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518 > > > > > -- > > > Eduardo > > > >
On May 23, 2019 3:45 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > > > ----- Original Message ----- > > From: "Aleksandar Markovic" <aleksandar.m.mail@gmail.com> > > To: "Cleber Rosa" <crosa@redhat.com> > > Cc: "Wainer dos Santos Moschetta" <wainersm@redhat.com>, "Aleksandar Markovic" <amarkovic@wavecomp.com>, > > qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Eduardo Habkost" <ehabkost@redhat.com>, > > "Aurelien Jarno" <aurelien@aurel32.net>, "Philippe Mathieu-Daudé" < f4bug@amsat.org> > > Sent: Wednesday, May 22, 2019 6:43:54 PM > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > > > > > > > > > > > ----- Original Message ----- > > > > From: "Eduardo Habkost" <ehabkost@redhat.com> > > > > To: "Philippe Mathieu-Daudé" <f4bug@amsat.org> > > > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com >, > > "Aleksandar Markovic" > > > > <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" < > > amarkovic@wavecomp.com>, "Cleber Rosa" <crosa@redhat.com>, > > > > "Aurelien Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" < > > wainersm@redhat.com> > > > > Sent: Wednesday, May 22, 2019 5:12:30 PM > > > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > > > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > > > > Hi, > > > > > > > > > > It was a rainy week-end here, so I invested it to automatize some > > > > > of my MIPS tests. > > > > > > > > > > The BootLinuxSshTest is not Global warming friendly, it is not > > > > > meant to run on a CI system but rather on a workstation previous > > > > > to post a pull request. > > > > > It can surely be improved, but it is a good starting point. > > > > > > > > Until we actually have a mechanism to exclude the test case on > > > > travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > > > please don't merge patch 4/4 yet or it will break travis-ci. > > > > > > > > Cleber, Wainer, is it already possible to make "avocado run" skip > > > > tests tagged with "slow"? > > > > > > > > > > The mechanism exists, but we haven't tagged any test so far as slow. > > > > > > > Cleber, > > > > For the test from patch 4/4, there is no dilemma - it should be in the > > “slow” group, as Philippe envisioned and said, so that it is not humpered > > with stricter requirements for “fast” (default) group. Could you explain us > > how to do it, so that we can hopefully finally proceed? > > > > Hi Aleksandar, > > The point is that there's no "group" definition at this point. This is the > core of the discussion. > > I think we're close to converging to something simple and effective. Please > let us know what you think of the proposals given. > > Thanks! > - Cleber. > Cleber, hi. Thanks for responding. My views are very similar to Philippe's, but I will provide you with more details of our (mips) perspective. As far as black/whitelist issues that is a moot point for us - we only want to be able to have a way to tag a test within the test itself (so, without updating some common files, external lists,etc.) In general, we would like to have a test environment where we would be able to test what WE deem suitable for us, without feeling that we bother you or anybody else, or that we are bothered by others. 
Let me give you a little extreme example: Let's say we want a complex test that downloads components from let's say fifty internet locations, executes a zillion test cases, and lasts two days. I wouldn't like anybody to ask me “Why would you do that?” or tell me “You can't do this.” or say “No, we did not anticipate such tests, patch rejected.” I (we, people from mips) should be able to define what I (we) need. Having such a test would be a big deal for me, not only because I could run it manually or automatically every weekend, but because I could ask submitters of critical changes: “Did you run this test that we have in the Avocado dir?”, without specifying test details, procedures, etc. All this is a BIG deal for me. On the other hand, I agree that a certain group of tests (envisioned for daily-or-so Travis CI runs) should have some stricter limitations and structure. But right now I feel hampered by it, and this is counterproductive. So, we want freedom, responsibility and ownership of our tests. Please give us the opportunity to get down to business and start writing tests and start testing. Yours, Aleksandar > > Gratefully, > > Aleksandar > > > > > Should we define/document a criteria for a test to be slow? Given > > > that this is highly subjective, we have to think of: > > > > > > * Will we consider the average or maximum run time (the timeout > > > definition)? > > > > > > * For a single test, what is "slow"? Some rough numbers from Travis > > > CI[1] to help us with guidelines: > > > - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > > > - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > > > - > > linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: > > PASS (18.14 s) > > > - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > > > > > * Do we want to set a maximum job timeout? This way we can skip > > > tests after a given amount of time has passed. Currently we interrupt > > > the test running when the job timeout is reached, but it's possible > > > to add a option so that no new tests will be started, but currently > > > running ones will be waited on. > > > > > > Regards, > > > - Cleber. > > > > > > [1] - https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518 > > > > > > > -- > > > > Eduardo > > > > > >
On 5/23/19 7:11 PM, Aleksandar Markovic wrote: > On May 23, 2019 3:45 PM, "Cleber Rosa" <crosa@redhat.com> wrote: >>> From: "Aleksandar Markovic" <aleksandar.m.mail@gmail.com> >>> On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: >>>>> From: "Eduardo Habkost" <ehabkost@redhat.com> >>>>> >>>>> Until we actually have a mechanism to exclude the test case on >>>>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, >>>>> please don't merge patch 4/4 yet or it will break travis-ci. >>>>> >>>>> Cleber, Wainer, is it already possible to make "avocado run" skip >>>>> tests tagged with "slow"? >>>>> >>>> >>>> The mechanism exists, but we haven't tagged any test so far as slow. >>>> >>> >>> Cleber, >>> >>> For the test from patch 4/4, there is no dilemma - it should be in the >>> “slow” group, as Philippe envisioned and said, so that it is not > humpered >>> with stricter requirements for “fast” (default) group. Could you > explain us >>> how to do it, so that we can hopefully finally proceed? >>> >> >> Hi Aleksandar, >> >> The point is that there's no "group" definition at this point. This is > the >> core of the discussion. >> >> I think we're close to converging to something simple and effective. > Please >> let us know what you think of the proposals given. >> >> Thanks! >> - Cleber. >> > > Cleber, hi. > > Thanks for responding. > > My views are very similar to Philippe's, but I will provide you with more > details of our (mips) perspective. > > As far as black/whitelist issues that is a moot point for us - we only want > to be able to have a way to tag a test within the test itself (so, without > updating some common files, external lists,etc.) > > In general, we would like to have a test environment where we would be able > to test what WE deem suitable for us, without feeling that we bother you or > anybody else, or that we are bothered by others. > > Let me give you a little extreme example: Let's say we want a complex test > that downloads components from let's say fifty internet location, executes > zillion test cases, and last two days. I wouldn't like anybody to ask me > “Why would you that?” or tell me “You can't do this.” or say “No, we did > not anticipate such tests, patch rejected.” I (we, people from mips) should > be able to define what I (we) need. Maybe we can use subdirectory like we do for the TCG tests (Aleksandar maintains tests/tcg/mips/). We should try to keep contribution upstream, so good idea/pattern can be reused by others. What I'd like to have with those tests is, at least: 1/ we don't need to run all the tests (but there is a set of 'quick' tests we can run on daily basis) 2/ maintainers can run their default tests easily (using a combination of Avocado tags) 3/ if a developer working on the PCI subsystem has to modify the MIPS subsystem (for example), he should be able to run the MIPS tests before sending his series. > Having such test would be a big deal for me, not only that I could run it > manually or automatically every weekend, but I could ask submitters of > critical changes: “Did you run this test that we have in Avocado dir?”, > without specifying test details, procedures, etc. All this is a BIG deal > for me. > > On the other hand, I agree that certain group of tests (envisioned for > daily or so Travis CI) should have some stricter limitations and structure. > But right now I feel humpered by it, and this is counterproductive. > > So, we want freedom, responsibility and ownersheep of our tests. 
Please > give us the opportunity to get down on business and start writing tests and > start testing. > > Yours, > Aleksandar
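To make Philippe's points 2/ and 3/ above concrete: with per-test docstring tags, a maintainer or a developer touching another subsystem can select just the relevant tests from the command line. This is only a sketch; the arch:mips and machine:malta tag values are illustrative assumptions, not an agreed-upon scheme.

from avocado_qemu import Test

class MaltaMachine(Test):
    """
    :avocado: tags=arch:mips,machine:malta
    """
    def test_boot_busybox(self):
        # kernel + BusyBox initrd smoke test
        pass

# Point 3/: a developer whose series touches MIPS code runs only the
# MIPS tests before posting:
#   avocado run --filter-by-tags=arch:mips tests/acceptance/
# Point 2/: a maintainer's default set can require several tags at once
# by listing them in a single option:
#   avocado run --filter-by-tags=arch:mips,machine:malta tests/acceptance/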
On May 23, 2019 7:27 PM, "Philippe Mathieu-Daudé" <f4bug@amsat.org> wrote: > > On 5/23/19 7:11 PM, Aleksandar Markovic wrote: > > On May 23, 2019 3:45 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > >>> From: "Aleksandar Markovic" <aleksandar.m.mail@gmail.com> > >>> On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > >>>>> From: "Eduardo Habkost" <ehabkost@redhat.com> > >>>>> > >>>>> Until we actually have a mechanism to exclude the test case on > >>>>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > >>>>> please don't merge patch 4/4 yet or it will break travis-ci. > >>>>> > >>>>> Cleber, Wainer, is it already possible to make "avocado run" skip > >>>>> tests tagged with "slow"? > >>>>> > >>>> > >>>> The mechanism exists, but we haven't tagged any test so far as slow. > >>>> > >>> > >>> Cleber, > >>> > >>> For the test from patch 4/4, there is no dilemma - it should be in the > >>> “slow” group, as Philippe envisioned and said, so that it is not > > humpered > >>> with stricter requirements for “fast” (default) group. Could you > > explain us > >>> how to do it, so that we can hopefully finally proceed? > >>> > >> > >> Hi Aleksandar, > >> > >> The point is that there's no "group" definition at this point. This is > > the > >> core of the discussion. > >> > >> I think we're close to converging to something simple and effective. > > Please > >> let us know what you think of the proposals given. > >> > >> Thanks! > >> - Cleber. > >> > > > > Cleber, hi. > > > > Thanks for responding. > > > > My views are very similar to Philippe's, but I will provide you with more > > details of our (mips) perspective. > > > > As far as black/whitelist issues that is a moot point for us - we only want > > to be able to have a way to tag a test within the test itself (so, without > > updating some common files, external lists,etc.) > > > > In general, we would like to have a test environment where we would be able > > to test what WE deem suitable for us, without feeling that we bother you or > > anybody else, or that we are bothered by others. > > > > Let me give you a little extreme example: Let's say we want a complex test > > that downloads components from let's say fifty internet location, executes > > zillion test cases, and last two days. I wouldn't like anybody to ask me > > “Why would you that?” or tell me “You can't do this.” or say “No, we did > > not anticipate such tests, patch rejected.” I (we, people from mips) should > > be able to define what I (we) need. > > Maybe we can use subdirectory like we do for the TCG tests (Aleksandar > maintains tests/tcg/mips/). We should try to keep contribution upstream, > so good idea/pattern can be reused by others. > > What I'd like to have with those tests is, at least: > > 1/ we don't need to run all the tests (but there is a set of 'quick' > tests we can run on daily basis) > > 2/ maintainers can run their default tests easily (using a combination > of Avocado tags) > > 3/ if a developer working on the PCI subsystem has to modify the MIPS > subsystem (for example), he should be able to run the MIPS tests before > sending his series. > Exactly! Excellent ideas and examples! > > Having such test would be a big deal for me, not only that I could run it > > manually or automatically every weekend, but I could ask submitters of > > critical changes: “Did you run this test that we have in Avocado dir?”, > > without specifying test details, procedures, etc. All this is a BIG deal > > for me. 
> > > > On the other hand, I agree that certain group of tests (envisioned for > > daily or so Travis CI) should have some stricter limitations and structure. > > But right now I feel humpered by it, and this is counterproductive. > > > > So, we want freedom, responsibility and ownersheep of our tests. Please > > give us the opportunity to get down on business and start writing tests and > > start testing. > > > > Yours, > > Aleksandar
On Thu, May 23, 2019 at 07:27:35PM +0200, Philippe Mathieu-Daudé wrote: > On 5/23/19 7:11 PM, Aleksandar Markovic wrote: > > On May 23, 2019 3:45 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > >>> From: "Aleksandar Markovic" <aleksandar.m.mail@gmail.com> > >>> On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > >>>>> From: "Eduardo Habkost" <ehabkost@redhat.com> > >>>>> > >>>>> Until we actually have a mechanism to exclude the test case on > >>>>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > >>>>> please don't merge patch 4/4 yet or it will break travis-ci. > >>>>> > >>>>> Cleber, Wainer, is it already possible to make "avocado run" skip > >>>>> tests tagged with "slow"? > >>>>> > >>>> > >>>> The mechanism exists, but we haven't tagged any test so far as slow. > >>>> > >>> > >>> Cleber, > >>> > >>> For the test from patch 4/4, there is no dilemma - it should be in the > >>> “slow” group, as Philippe envisioned and said, so that it is not > > humpered > >>> with stricter requirements for “fast” (default) group. Could you > > explain us > >>> how to do it, so that we can hopefully finally proceed? > >>> > >> > >> Hi Aleksandar, > >> > >> The point is that there's no "group" definition at this point. This is > > the > >> core of the discussion. > >> > >> I think we're close to converging to something simple and effective. > > Please > >> let us know what you think of the proposals given. > >> > >> Thanks! > >> - Cleber. > >> > > > > Cleber, hi. > > > > Thanks for responding. > > > > My views are very similar to Philippe's, but I will provide you with more > > details of our (mips) perspective. > > > > As far as black/whitelist issues that is a moot point for us - we only want > > to be able to have a way to tag a test within the test itself (so, without > > updating some common files, external lists,etc.) > > > > In general, we would like to have a test environment where we would be able > > to test what WE deem suitable for us, without feeling that we bother you or > > anybody else, or that we are bothered by others. > > > > Let me give you a little extreme example: Let's say we want a complex test > > that downloads components from let's say fifty internet location, executes > > zillion test cases, and last two days. I wouldn't like anybody to ask me > > “Why would you that?” or tell me “You can't do this.” or say “No, we did > > not anticipate such tests, patch rejected.” I (we, people from mips) should > > be able to define what I (we) need. > > Maybe we can use subdirectory like we do for the TCG tests (Aleksandar > maintains tests/tcg/mips/). We should try to keep contribution upstream, > so good idea/pattern can be reused by others. > > What I'd like to have with those tests is, at least: > > 1/ we don't need to run all the tests (but there is a set of 'quick' > tests we can run on daily basis) > > 2/ maintainers can run their default tests easily (using a combination > of Avocado tags) > > 3/ if a developer working on the PCI subsystem has to modify the MIPS > subsystem (for example), he should be able to run the MIPS tests before > sending his series. Keeping the test cases organized in subdirectories are a good idea, but don't think this is going to help us when we need to quickly enable/disable specific test cases on some CI systems. Disabling a test case (or an entire category of test cases) known to be failing on some CI systems should require a one line patch, not moving files to a separate directory. 
> > > Having such test would be a big deal for me, not only that I could run it > > manually or automatically every weekend, but I could ask submitters of > > critical changes: “Did you run this test that we have in Avocado dir?”, > > without specifying test details, procedures, etc. All this is a BIG deal > > for me. > > > > On the other hand, I agree that certain group of tests (envisioned for > > daily or so Travis CI) should have some stricter limitations and structure. > > But right now I feel humpered by it, and this is counterproductive. > > > > So, we want freedom, responsibility and ownersheep of our tests. Please > > give us the opportunity to get down on business and start writing tests and > > start testing. > > > > Yours, > > Aleksandar -- Eduardo
On May 24, 2019 9:40 PM, "Eduardo Habkost" <ehabkost@redhat.com> wrote: > > On Thu, May 23, 2019 at 07:27:35PM +0200, Philippe Mathieu-Daudé wrote: > > On 5/23/19 7:11 PM, Aleksandar Markovic wrote: > > > On May 23, 2019 3:45 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > >>> From: "Aleksandar Markovic" <aleksandar.m.mail@gmail.com> > > >>> On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > >>>>> From: "Eduardo Habkost" <ehabkost@redhat.com> > > >>>>> > > >>>>> Until we actually have a mechanism to exclude the test case on > > >>>>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > >>>>> please don't merge patch 4/4 yet or it will break travis-ci. > > >>>>> > > >>>>> Cleber, Wainer, is it already possible to make "avocado run" skip > > >>>>> tests tagged with "slow"? > > >>>>> > > >>>> > > >>>> The mechanism exists, but we haven't tagged any test so far as slow. > > >>>> > > >>> > > >>> Cleber, > > >>> > > >>> For the test from patch 4/4, there is no dilemma - it should be in the > > >>> “slow” group, as Philippe envisioned and said, so that it is not > > > humpered > > >>> with stricter requirements for “fast” (default) group. Could you > > > explain us > > >>> how to do it, so that we can hopefully finally proceed? > > >>> > > >> > > >> Hi Aleksandar, > > >> > > >> The point is that there's no "group" definition at this point. This is > > > the > > >> core of the discussion. > > >> > > >> I think we're close to converging to something simple and effective. > > > Please > > >> let us know what you think of the proposals given. > > >> > > >> Thanks! > > >> - Cleber. > > >> > > > > > > Cleber, hi. > > > > > > Thanks for responding. > > > > > > My views are very similar to Philippe's, but I will provide you with more > > > details of our (mips) perspective. > > > > > > As far as black/whitelist issues that is a moot point for us - we only want > > > to be able to have a way to tag a test within the test itself (so, without > > > updating some common files, external lists,etc.) > > > > > > In general, we would like to have a test environment where we would be able > > > to test what WE deem suitable for us, without feeling that we bother you or > > > anybody else, or that we are bothered by others. > > > > > > Let me give you a little extreme example: Let's say we want a complex test > > > that downloads components from let's say fifty internet location, executes > > > zillion test cases, and last two days. I wouldn't like anybody to ask me > > > “Why would you that?” or tell me “You can't do this.” or say “No, we did > > > not anticipate such tests, patch rejected.” I (we, people from mips) should > > > be able to define what I (we) need. > > > > Maybe we can use subdirectory like we do for the TCG tests (Aleksandar > > maintains tests/tcg/mips/). We should try to keep contribution upstream, > > so good idea/pattern can be reused by others. > > > > What I'd like to have with those tests is, at least: > > > > 1/ we don't need to run all the tests (but there is a set of 'quick' > > tests we can run on daily basis) > > > > 2/ maintainers can run their default tests easily (using a combination > > of Avocado tags) > > > > 3/ if a developer working on the PCI subsystem has to modify the MIPS > > subsystem (for example), he should be able to run the MIPS tests before > > sending his series. 
> > Keeping the test cases organized in subdirectories are a good > idea, but don't think this is going to help us when we need to > quickly enable/disable specific test cases on some CI systems. > Well, Eduardo, nobody said that directory locations should be used for enabling/disabling or tagging/untagging tests in the first place. I think it was clear for everybody from the outset that these features should have their own mechanisms, which Cleber says already exist, but can't be used because the test group still can't figure out (in some hamletesque way) whether to blacklist or to whitelist, or how to name the tag for travis, and tag for not travis, and if such tags should even exist, etc. - that is my layman impression from recent discussions. And now when Philippe suggested (in my opinion logical and reasonable) subdirectory, an endless discussion begins: “To subdirectory or not to subdirectory? That is the question.” Meanwhile, 4.1 is inexorably getting closer and closer, and with each day, the value of any potential tests is decreasing. Directory structure should be used in its usual and basic way: for clustering files of similar nature, purpose, or origin, and I do certainly support any reasonable subdirectory organization for your directory - and you should think about it, and probably while doing that consult a little bit other people from all walks of QEMU. We are ready to comply with your final decision. The good thing is that nothing is set in stone, everything can be changed and improved, moving files is easy in git. All that said, many thanks for reviewing patch 4/4. Aleksandar > Disabling a test case (or an entire category of test cases) known > to be failing on some CI systems should require a one line patch, > not moving files to a separate directory. > > > > > > Having such test would be a big deal for me, not only that I could run it > > > manually or automatically every weekend, but I could ask submitters of > > > critical changes: “Did you run this test that we have in Avocado dir?”, > > > without specifying test details, procedures, etc. All this is a BIG deal > > > for me. > > > > > > On the other hand, I agree that certain group of tests (envisioned for > > > daily or so Travis CI) should have some stricter limitations and structure. > > > But right now I feel humpered by it, and this is counterproductive. > > > > > > So, we want freedom, responsibility and ownersheep of our tests. Please > > > give us the opportunity to get down on business and start writing tests and > > > start testing. > > > > > > Yours, > > > Aleksandar > > -- > Eduardo
On Fri, May 24, 2019 at 10:32:36PM +0200, Aleksandar Markovic wrote: > On May 24, 2019 9:40 PM, "Eduardo Habkost" <ehabkost@redhat.com> wrote: > > > > On Thu, May 23, 2019 at 07:27:35PM +0200, Philippe Mathieu-Daudé wrote: > > > On 5/23/19 7:11 PM, Aleksandar Markovic wrote: > > > > On May 23, 2019 3:45 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > > >>> From: "Aleksandar Markovic" <aleksandar.m.mail@gmail.com> > > > >>> On May 22, 2019 11:46 PM, "Cleber Rosa" <crosa@redhat.com> wrote: > > > >>>>> From: "Eduardo Habkost" <ehabkost@redhat.com> > > > >>>>> > > > >>>>> Until we actually have a mechanism to exclude the test case on > > > >>>>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > > >>>>> please don't merge patch 4/4 yet or it will break travis-ci. > > > >>>>> > > > >>>>> Cleber, Wainer, is it already possible to make "avocado run" skip > > > >>>>> tests tagged with "slow"? > > > >>>>> > > > >>>> > > > >>>> The mechanism exists, but we haven't tagged any test so far as > slow. > > > >>>> > > > >>> > > > >>> Cleber, > > > >>> > > > >>> For the test from patch 4/4, there is no dilemma - it should be in > the > > > >>> “slow” group, as Philippe envisioned and said, so that it is not > > > > humpered > > > >>> with stricter requirements for “fast” (default) group. Could you > > > > explain us > > > >>> how to do it, so that we can hopefully finally proceed? > > > >>> > > > >> > > > >> Hi Aleksandar, > > > >> > > > >> The point is that there's no "group" definition at this point. This > is > > > > the > > > >> core of the discussion. > > > >> > > > >> I think we're close to converging to something simple and effective. > > > > Please > > > >> let us know what you think of the proposals given. > > > >> > > > >> Thanks! > > > >> - Cleber. > > > >> > > > > > > > > Cleber, hi. > > > > > > > > Thanks for responding. > > > > > > > > My views are very similar to Philippe's, but I will provide you with > more > > > > details of our (mips) perspective. > > > > > > > > As far as black/whitelist issues that is a moot point for us - we > only want > > > > to be able to have a way to tag a test within the test itself (so, > without > > > > updating some common files, external lists,etc.) > > > > > > > > In general, we would like to have a test environment where we would > be able > > > > to test what WE deem suitable for us, without feeling that we bother > you or > > > > anybody else, or that we are bothered by others. > > > > > > > > Let me give you a little extreme example: Let's say we want a complex > test > > > > that downloads components from let's say fifty internet location, > executes > > > > zillion test cases, and last two days. I wouldn't like anybody to ask > me > > > > “Why would you that?” or tell me “You can't do this.” or say “No, we > did > > > > not anticipate such tests, patch rejected.” I (we, people from mips) > should > > > > be able to define what I (we) need. > > > > > > Maybe we can use subdirectory like we do for the TCG tests (Aleksandar > > > maintains tests/tcg/mips/). We should try to keep contribution upstream, > > > so good idea/pattern can be reused by others. 
> > > > > > What I'd like to have with those tests is, at least: > > > > > > 1/ we don't need to run all the tests (but there is a set of 'quick' > > > tests we can run on daily basis) > > > > > > 2/ maintainers can run their default tests easily (using a combination > > > of Avocado tags) > > > > > > 3/ if a developer working on the PCI subsystem has to modify the MIPS > > > subsystem (for example), he should be able to run the MIPS tests before > > > sending his series. > > > > Keeping the test cases organized in subdirectories are a good > > idea, but don't think this is going to help us when we need to > > quickly enable/disable specific test cases on some CI systems. > > > > Well, Eduardo, nobody said that directory locations should be used for > enabling/disabling or tagging/untagging tests in the first place. I think > it was clear for everybody from the outset that these features should have > their own mechanisms, which Cleber says already exist, but can't be used > because the test group still can't figure out (in some hamletesque way) > whether to blacklist or to whitelist, or how to name the tag for travis, > and tag for not travis, and if such tags should even exist, etc. - that is > my layman impression from recent discussions. And now when Philippe > suggested (in my opinion logical and reasonable) subdirectory, an endless > discussion begins: “To subdirectory or not to subdirectory? That is the > question.” Meanwhile, 4.1 is inexorably getting closer and closer, and with > each day, the value of any potential tests is decreasing. I understand that seeing the discussions going on and the patches taking too long to be included might be frustrating. These discussions shouldn't get into the way of addressing other problems. We don't need to wait until all discussions have finished before proposing new patches or before merging patches that are considered good. > > Directory structure should be used in its usual and basic way: for > clustering files of similar nature, purpose, or origin, and I do certainly > support any reasonable subdirectory organization for your directory - and > you should think about it, and probably while doing that consult a little > bit other people from all walks of QEMU. We are ready to comply with your > final decision. About subdirectories, specifically, note that I explicitly said it was a good idea. If somebody wants to send patches, they are welcome. If I'm doing something else that could be blocking people from getting work done, I'd like to fix that. I'm aware that sometimes I take too long to review patches, but I hope other developers can help us on the review work. > > The good thing is that nothing is set in stone, everything can be changed > and improved, moving files is easy in git. > > All that said, many thanks for reviewing patch 4/4. > > Aleksandar > > > > > Disabling a test case (or an entire category of test cases) known > > to be failing on some CI systems should require a one line patch, > > not moving files to a separate directory. > > > > > > > > > Having such test would be a big deal for me, not only that I could > run it > > > > manually or automatically every weekend, but I could ask submitters of > > > > critical changes: “Did you run this test that we have in Avocado > dir?”, > > > > without specifying test details, procedures, etc. All this is a BIG > deal > > > > for me. 
> > > > > > > > On the other hand, I agree that certain group of tests (envisioned for > > > > daily or so Travis CI) should have some stricter limitations and > structure. > > > > But right now I feel humpered by it, and this is counterproductive. > > > > > > > > So, we want freedom, responsibility and ownersheep of our tests. > Please > > > > give us the opportunity to get down on business and start writing > tests and > > > > start testing. > > > > > > > > Yours, > > > > Aleksandar > > > > -- > > Eduardo -- Eduardo
On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote: > > > ----- Original Message ----- > > From: "Eduardo Habkost" <ehabkost@redhat.com> > > To: "Philippe Mathieu-Daudé" <f4bug@amsat.org> > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Aleksandar Markovic" > > <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" <amarkovic@wavecomp.com>, "Cleber Rosa" <crosa@redhat.com>, > > "Aurelien Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" <wainersm@redhat.com> > > Sent: Wednesday, May 22, 2019 5:12:30 PM > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > > Hi, > > > > > > It was a rainy week-end here, so I invested it to automatize some > > > of my MIPS tests. > > > > > > The BootLinuxSshTest is not Global warming friendly, it is not > > > meant to run on a CI system but rather on a workstation previous > > > to post a pull request. > > > It can surely be improved, but it is a good starting point. > > > > Until we actually have a mechanism to exclude the test case on > > travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > please don't merge patch 4/4 yet or it will break travis-ci. > > > > Cleber, Wainer, is it already possible to make "avocado run" skip > > tests tagged with "slow"? > > > > The mechanism exists, but we haven't tagged any test so far as slow. > > Should we define/document a criteria for a test to be slow? Given > that this is highly subjective, we have to think of: > > * Will we consider the average or maximum run time (the timeout > definition)? > > * For a single test, what is "slow"? Some rough numbers from Travis > CI[1] to help us with guidelines: > - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > - linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: PASS (18.14 s) > - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) I don't think we need to overthink this. Whatever objective criteria we choose, I'm sure we'll have to adapt them later due to real world problems. e.g.: is 396 seconds too slow? I don't know, it depends: does it break Travis and other CI systems often because of timeouts? If yes, then we should probably tag it as slow. If having subjective criteria is really a problem (I don't think it is), then we can call the tag "skip_travis", and stop worrying about defining what exactly is "slow". > > * Do we want to set a maximum job timeout? This way we can skip > tests after a given amount of time has passed. Currently we interrupt > the test running when the job timeout is reached, but it's possible > to add a option so that no new tests will be started, but currently > running ones will be waited on. I'm not sure I understand the suggestion to skip tests. If we skip tests after a timeout, how would we differentiate a test being expectedly slow from a QEMU hang? -- Eduardo
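One way to address Eduardo's last question with what Avocado already provides: a per-test timeout bounds an individual (possibly hung) test, while the job timeout bounds the whole run, so an expectedly slow test simply declares a generous per-test limit and a QEMU hang is still caught and reported for that one test. A sketch, again assuming the avocado_qemu.Test base class; the values and the time-suffix syntax are illustrative, not prescriptive.

from avocado_qemu import Test

class BootLinuxSshTest(Test):
    """
    :avocado: tags=slow
    """
    # Per-test limit, in seconds: if QEMU hangs, this single test is
    # interrupted and reported as such, independently of the job limit.
    timeout = 600

    def test_mips_malta_ssh(self):
        pass

# The job-level limit is a runner option and is independent of the above:
#   avocado run --job-timeout=30m tests/acceptance/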
On 5/23/19 1:07 AM, Eduardo Habkost wrote: > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote: >> ----- Original Message ----- >>> From: "Eduardo Habkost" <ehabkost@redhat.com> >>> On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: >>>> Hi, >>>> >>>> It was a rainy week-end here, so I invested it to automatize some >>>> of my MIPS tests. >>>> >>>> The BootLinuxSshTest is not Global warming friendly, it is not >>>> meant to run on a CI system but rather on a workstation previous >>>> to post a pull request. >>>> It can surely be improved, but it is a good starting point. >>> >>> Until we actually have a mechanism to exclude the test case on >>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, >>> please don't merge patch 4/4 yet or it will break travis-ci. >>> >>> Cleber, Wainer, is it already possible to make "avocado run" skip >>> tests tagged with "slow"? >>> >> >> The mechanism exists, but we haven't tagged any test so far as slow. >> >> Should we define/document a criteria for a test to be slow? Given >> that this is highly subjective, we have to think of: >> >> * Will we consider the average or maximum run time (the timeout >> definition)? >> >> * For a single test, what is "slow"? Some rough numbers from Travis >> CI[1] to help us with guidelines: >> - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) >> - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) >> - linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: PASS (18.14 s) >> - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > I don't think we need to overthink this. Whatever objective > criteria we choose, I'm sure we'll have to adapt them later due > to real world problems. > > e.g.: is 396 seconds too slow? I don't know, it depends: does it > break Travis and other CI systems often because of timeouts? If > yes, then we should probably tag it as slow. > > If having subjective criteria is really a problem (I don't think > it is), then we can call the tag "skip_travis", and stop worrying > about defining what exactly is "slow". I'd go with a simpler "tags:travis-ci" whitelisting any job expecting to run smoothly there. Then we can add "slow" tests without having to worry about blacklisting for Travis CI. Also, Other CI can set different timeouts. I'd like maintainers to add as many tests as they want to upstream, so these tests can eventually run by anyone, then each maintainer is free to select which particular set he wants to run as default. >> * Do we want to set a maximum job timeout? This way we can skip >> tests after a given amount of time has passed. Currently we interrupt >> the test running when the job timeout is reached, but it's possible >> to add a option so that no new tests will be started, but currently >> running ones will be waited on. > > I'm not sure I understand the suggestion to skip tests. If we > skip tests after a timeout, how would we differentiate a test > being expectedly slow from a QEMU hang? >
----- Original Message ----- > From: "Philippe Mathieu-Daudé" <philmd@redhat.com> > To: "Eduardo Habkost" <ehabkost@redhat.com>, "Cleber Rosa" <crosa@redhat.com> > Cc: "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>, "Wainer dos Santos > Moschetta" <wainersm@redhat.com>, qemu-devel@nongnu.org, "Aleksandar Markovic" <aleksandar.m.mail@gmail.com>, > "Aleksandar Markovic" <amarkovic@wavecomp.com>, "Aurelien Jarno" <aurelien@aurel32.net> > Sent: Thursday, May 23, 2019 5:38:34 AM > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > On 5/23/19 1:07 AM, Eduardo Habkost wrote: > > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote: > >> ----- Original Message ----- > >>> From: "Eduardo Habkost" <ehabkost@redhat.com> > >>> On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > >>>> Hi, > >>>> > >>>> It was a rainy week-end here, so I invested it to automatize some > >>>> of my MIPS tests. > >>>> > >>>> The BootLinuxSshTest is not Global warming friendly, it is not > >>>> meant to run on a CI system but rather on a workstation previous > >>>> to post a pull request. > >>>> It can surely be improved, but it is a good starting point. > >>> > >>> Until we actually have a mechanism to exclude the test case on > >>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > >>> please don't merge patch 4/4 yet or it will break travis-ci. > >>> > >>> Cleber, Wainer, is it already possible to make "avocado run" skip > >>> tests tagged with "slow"? > >>> > >> > >> The mechanism exists, but we haven't tagged any test so far as slow. > >> > >> Should we define/document a criteria for a test to be slow? Given > >> that this is highly subjective, we have to think of: > >> > >> * Will we consider the average or maximum run time (the timeout > >> definition)? > >> > >> * For a single test, what is "slow"? Some rough numbers from Travis > >> CI[1] to help us with guidelines: > >> - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > >> - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > >> - > >> linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: > >> PASS (18.14 s) > >> - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > > > I don't think we need to overthink this. Whatever objective > > criteria we choose, I'm sure we'll have to adapt them later due > > to real world problems. > > > > e.g.: is 396 seconds too slow? I don't know, it depends: does it > > break Travis and other CI systems often because of timeouts? If > > yes, then we should probably tag it as slow. > > > > If having subjective criteria is really a problem (I don't think > > it is), then we can call the tag "skip_travis", and stop worrying > > about defining what exactly is "slow". > > I'd go with a simpler "tags:travis-ci" whitelisting any job expecting to > run smoothly there. > My concern is what becomes of "make check-acceptance". Should we introduce another target, say, "make check-acceptance-ci" or just change its meaning and reuse it? > Then we can add "slow" tests without having to worry about blacklisting > for Travis CI. > Also, Other CI can set different timeouts. > > I'd like maintainers to add as many tests as they want to upstream, so > these tests can eventually run by anyone, then each maintainer is free > to select which particular set he wants to run as default. > OK, so this matches the idea of carefully curating a set of tests for CI. 
WRT white or blacklisting, I favor the approach that requires the least effort from the developer to have their test enabled, so I'd go with blacklisting. I fear that simple tests will just sit on the repo without being properly exercised if we need to whitelist them. But I'll certainly and gladly accept the majority's opinion here. Regards, - Cleber. > >> * Do we want to set a maximum job timeout? This way we can skip > >> tests after a given amount of time has passed. Currently we interrupt > >> the test running when the job timeout is reached, but it's possible > >> to add a option so that no new tests will be started, but currently > >> running ones will be waited on. > > > > I'm not sure I understand the suggestion to skip tests. If we > > skip tests after a timeout, how would we differentiate a test > > being expectedly slow from a QEMU hang? > > >
On Thu, May 23, 2019 at 09:28:00AM -0400, Cleber Rosa wrote: > > > ----- Original Message ----- > > From: "Philippe Mathieu-Daudé" <philmd@redhat.com> > > To: "Eduardo Habkost" <ehabkost@redhat.com>, "Cleber Rosa" <crosa@redhat.com> > > Cc: "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>, "Wainer dos Santos > > Moschetta" <wainersm@redhat.com>, qemu-devel@nongnu.org, "Aleksandar Markovic" <aleksandar.m.mail@gmail.com>, > > "Aleksandar Markovic" <amarkovic@wavecomp.com>, "Aurelien Jarno" <aurelien@aurel32.net> > > Sent: Thursday, May 23, 2019 5:38:34 AM > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > On 5/23/19 1:07 AM, Eduardo Habkost wrote: > > > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote: > > >> ----- Original Message ----- > > >>> From: "Eduardo Habkost" <ehabkost@redhat.com> > > >>> On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > >>>> Hi, > > >>>> > > >>>> It was a rainy week-end here, so I invested it to automatize some > > >>>> of my MIPS tests. > > >>>> > > >>>> The BootLinuxSshTest is not Global warming friendly, it is not > > >>>> meant to run on a CI system but rather on a workstation previous > > >>>> to post a pull request. > > >>>> It can surely be improved, but it is a good starting point. > > >>> > > >>> Until we actually have a mechanism to exclude the test case on > > >>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > >>> please don't merge patch 4/4 yet or it will break travis-ci. > > >>> > > >>> Cleber, Wainer, is it already possible to make "avocado run" skip > > >>> tests tagged with "slow"? > > >>> > > >> > > >> The mechanism exists, but we haven't tagged any test so far as slow. > > >> > > >> Should we define/document a criteria for a test to be slow? Given > > >> that this is highly subjective, we have to think of: > > >> > > >> * Will we consider the average or maximum run time (the timeout > > >> definition)? > > >> > > >> * For a single test, what is "slow"? Some rough numbers from Travis > > >> CI[1] to help us with guidelines: > > >> - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > > >> - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > > >> - > > >> linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: > > >> PASS (18.14 s) > > >> - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > > > > > I don't think we need to overthink this. Whatever objective > > > criteria we choose, I'm sure we'll have to adapt them later due > > > to real world problems. > > > > > > e.g.: is 396 seconds too slow? I don't know, it depends: does it > > > break Travis and other CI systems often because of timeouts? If > > > yes, then we should probably tag it as slow. > > > > > > If having subjective criteria is really a problem (I don't think > > > it is), then we can call the tag "skip_travis", and stop worrying > > > about defining what exactly is "slow". > > > > I'd go with a simpler "tags:travis-ci" whitelisting any job expecting to > > run smoothly there. > > > > My concern is what becomes of "make check-acceptance". Should we introduce > another target, say, "make check-acceptance-ci" or just change its meaning > and reuse it? What about "make check-acceptance TAG=travis-ci"? > > > Then we can add "slow" tests without having to worry about blacklisting > > for Travis CI. > > Also, Other CI can set different timeouts. 
> > > > I'd like maintainers to add as many tests as they want to upstream, so > > these tests can eventually run by anyone, then each maintainer is free > > to select which particular set he wants to run as default. > > > > OK, so this matches the idea of carefully curating a set of tests for > CI. WRT white or blacklisting, I favor the approach that requires the > least effort from the developer to have its test enabled, so I'd go > with blacklisting. I fear that simple tests will just sit on the repo > without being properly exercised if we need to whitelist them. > I agree. I'd prefer the default case to be simple and not require extra tags. (i.e. tests without any tags would be run in Travis by default). -- Eduardo
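For reference, the tagging mechanism under discussion looks roughly like the sketch below. This is a minimal illustration, not code from the series: the tag name "slow" and the test body are placeholders, and the base class is plain avocado.Test rather than QEMU's acceptance-test helper. Tags live in ":avocado: tags=..." docstring directives, so a test that declares none is untagged and keeps running by default, which is the behaviour Eduardo asks for above.

# Minimal sketch, assuming only that the avocado-framework package is
# installed; the tag name "slow" and the test body are illustrative.
from avocado import Test


class BootLinuxSshTest(Test):

    def test_mips_malta_ssh(self):
        """
        :avocado: tags=arch:mips,machine:malta
        :avocado: tags=slow
        """
        # A real test would boot the Malta guest and exercise it over SSH;
        # logging a message keeps the sketch self-contained.
        self.log.info("boot guest and run SSH checks here")

If the installed Avocado supports tag filtering, a run such as "avocado run tests/acceptance --filter-by-tags=-slow --filter-by-tags-include-empty" would then execute everything except the tests that opted in to the "slow" tag, with no annotation needed on ordinary tests.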
On May 23, 2019 11:31 PM, "Eduardo Habkost" <ehabkost@redhat.com> wrote: > > On Thu, May 23, 2019 at 09:28:00AM -0400, Cleber Rosa wrote: > > > > > > ----- Original Message ----- > > > From: "Philippe Mathieu-Daudé" <philmd@redhat.com> > > > To: "Eduardo Habkost" <ehabkost@redhat.com>, "Cleber Rosa" < crosa@redhat.com> > > > Cc: "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>, "Wainer dos Santos > > > Moschetta" <wainersm@redhat.com>, qemu-devel@nongnu.org, "Aleksandar Markovic" <aleksandar.m.mail@gmail.com>, > > > "Aleksandar Markovic" <amarkovic@wavecomp.com>, "Aurelien Jarno" < aurelien@aurel32.net> > > > Sent: Thursday, May 23, 2019 5:38:34 AM > > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > > > On 5/23/19 1:07 AM, Eduardo Habkost wrote: > > > > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote: > > > >> ----- Original Message ----- > > > >>> From: "Eduardo Habkost" <ehabkost@redhat.com> > > > >>> On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > > >>>> Hi, > > > >>>> > > > >>>> It was a rainy week-end here, so I invested it to automatize some > > > >>>> of my MIPS tests. > > > >>>> > > > >>>> The BootLinuxSshTest is not Global warming friendly, it is not > > > >>>> meant to run on a CI system but rather on a workstation previous > > > >>>> to post a pull request. > > > >>>> It can surely be improved, but it is a good starting point. > > > >>> > > > >>> Until we actually have a mechanism to exclude the test case on > > > >>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > > >>> please don't merge patch 4/4 yet or it will break travis-ci. > > > >>> > > > >>> Cleber, Wainer, is it already possible to make "avocado run" skip > > > >>> tests tagged with "slow"? > > > >>> > > > >> > > > >> The mechanism exists, but we haven't tagged any test so far as slow. > > > >> > > > >> Should we define/document a criteria for a test to be slow? Given > > > >> that this is highly subjective, we have to think of: > > > >> > > > >> * Will we consider the average or maximum run time (the timeout > > > >> definition)? > > > >> > > > >> * For a single test, what is "slow"? Some rough numbers from Travis > > > >> CI[1] to help us with guidelines: > > > >> - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > > > >> - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > > > >> - > > > >> linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: > > > >> PASS (18.14 s) > > > >> - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > > > > > > > I don't think we need to overthink this. Whatever objective > > > > criteria we choose, I'm sure we'll have to adapt them later due > > > > to real world problems. > > > > > > > > e.g.: is 396 seconds too slow? I don't know, it depends: does it > > > > break Travis and other CI systems often because of timeouts? If > > > > yes, then we should probably tag it as slow. > > > > > > > > If having subjective criteria is really a problem (I don't think > > > > it is), then we can call the tag "skip_travis", and stop worrying > > > > about defining what exactly is "slow". > > > > > > I'd go with a simpler "tags:travis-ci" whitelisting any job expecting to > > > run smoothly there. > > > > > > > My concern is what becomes of "make check-acceptance". Should we introduce > > another target, say, "make check-acceptance-ci" or just change its meaning > > and reuse it? 
> > What about "make check-acceptance TAG=travis-ci"? > > > > > > Then we can add "slow" tests without having to worry about blacklisting > > > for Travis CI. > > > Also, Other CI can set different timeouts. > > > > > > I'd like maintainers to add as many tests as they want to upstream, so > > > these tests can eventually run by anyone, then each maintainer is free > > > to select which particular set he wants to run as default. > > > > > > > OK, so this matches the idea of carefully curating a set of tests for > > CI. WRT white or blacklisting, I favor the approach that requires the > > least effort from the developer to have its test enabled, so I'd go > > with blacklisting. I fear that simple tests will just sit on the repo > > without being properly exercised if we need to whitelist them. > > > > I agree. I'd prefer the default case to be simple and not > require extra tags. (i.e. tests without any tags would be run in > Travis by default). > Eduardo, You are confusing me here. You first suggest: > What about "make check-acceptance TAG=travis-ci"? ... and then say: > ...tests without any tags would be run in Travis by default. For casual observers like me it is contradictory, I must be missing something here, no? Regards, Aleksandar > -- > Eduardo
On Fri, May 24, 2019 at 03:45:56PM +0200, Aleksandar Markovic wrote: > On May 23, 2019 11:31 PM, "Eduardo Habkost" <ehabkost@redhat.com> wrote: > > > > On Thu, May 23, 2019 at 09:28:00AM -0400, Cleber Rosa wrote: > > > > > > > > > ----- Original Message ----- > > > > From: "Philippe Mathieu-Daudé" <philmd@redhat.com> > > > > To: "Eduardo Habkost" <ehabkost@redhat.com>, "Cleber Rosa" < > crosa@redhat.com> > > > > Cc: "Aleksandar Rikalo" <arikalo@wavecomp.com>, "Philippe > Mathieu-Daudé" <f4bug@amsat.org>, "Wainer dos Santos > > > > Moschetta" <wainersm@redhat.com>, qemu-devel@nongnu.org, "Aleksandar > Markovic" <aleksandar.m.mail@gmail.com>, > > > > "Aleksandar Markovic" <amarkovic@wavecomp.com>, "Aurelien Jarno" < > aurelien@aurel32.net> > > > > Sent: Thursday, May 23, 2019 5:38:34 AM > > > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > > > > > On 5/23/19 1:07 AM, Eduardo Habkost wrote: > > > > > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote: > > > > >> ----- Original Message ----- > > > > >>> From: "Eduardo Habkost" <ehabkost@redhat.com> > > > > >>> On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé > wrote: > > > > >>>> Hi, > > > > >>>> > > > > >>>> It was a rainy week-end here, so I invested it to automatize some > > > > >>>> of my MIPS tests. > > > > >>>> > > > > >>>> The BootLinuxSshTest is not Global warming friendly, it is not > > > > >>>> meant to run on a CI system but rather on a workstation previous > > > > >>>> to post a pull request. > > > > >>>> It can surely be improved, but it is a good starting point. > > > > >>> > > > > >>> Until we actually have a mechanism to exclude the test case on > > > > >>> travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > > > >>> please don't merge patch 4/4 yet or it will break travis-ci. > > > > >>> > > > > >>> Cleber, Wainer, is it already possible to make "avocado run" skip > > > > >>> tests tagged with "slow"? > > > > >>> > > > > >> > > > > >> The mechanism exists, but we haven't tagged any test so far as > slow. > > > > >> > > > > >> Should we define/document a criteria for a test to be slow? Given > > > > >> that this is highly subjective, we have to think of: > > > > >> > > > > >> * Will we consider the average or maximum run time (the timeout > > > > >> definition)? > > > > >> > > > > >> * For a single test, what is "slow"? Some rough numbers from > Travis > > > > >> CI[1] to help us with guidelines: > > > > >> - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS > (6.04 s) > > > > >> - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS > (2.91 s) > > > > >> - > > > > >> > linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: > > > > >> PASS (18.14 s) > > > > >> - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > > > > > > > > > I don't think we need to overthink this. Whatever objective > > > > > criteria we choose, I'm sure we'll have to adapt them later due > > > > > to real world problems. > > > > > > > > > > e.g.: is 396 seconds too slow? I don't know, it depends: does it > > > > > break Travis and other CI systems often because of timeouts? If > > > > > yes, then we should probably tag it as slow. > > > > > > > > > > If having subjective criteria is really a problem (I don't think > > > > > it is), then we can call the tag "skip_travis", and stop worrying > > > > > about defining what exactly is "slow". 
> > > > > > > > I'd go with a simpler "tags:travis-ci" whitelisting any job expecting > to > > > > run smoothly there. > > > > > > > > > > My concern is what becomes of "make check-acceptance". Should we > introduce > > > another target, say, "make check-acceptance-ci" or just change its > meaning > > > and reuse it? > > > > What about "make check-acceptance TAG=travis-ci"? > > > > > > > > > Then we can add "slow" tests without having to worry about > blacklisting > > > > for Travis CI. > > > > Also, Other CI can set different timeouts. > > > > > > > > I'd like maintainers to add as many tests as they want to upstream, so > > > > these tests can eventually run by anyone, then each maintainer is free > > > > to select which particular set he wants to run as default. > > > > > > > > > > OK, so this matches the idea of carefully curating a set of tests for > > > CI. WRT white or blacklisting, I favor the approach that requires the > > > least effort from the developer to have its test enabled, so I'd go > > > with blacklisting. I fear that simple tests will just sit on the repo > > > without being properly exercised if we need to whitelist them. > > > > > > > I agree. I'd prefer the default case to be simple and not > > require extra tags. (i.e. tests without any tags would be run in > > Travis by default). > > > > Eduardo, > > You are confusing me here. > > You first suggest: > > > What about "make check-acceptance TAG=travis-ci"? > I was just trying to suggest using make variables as input to check-acceptance, instead of creating separate makefile rules for each set of test cases. But you are right: > ... and then say: > > > ...tests without any tags would be run in Travis by default. > > For casual observers like me it is contradictory, I must be missing > something here, no? Yes, if we use tags to exclude tests, the command line would look different. Maybe something like: make check-acceptance EXCLUDE_TAGS=skip-travis The exact format of the arguments don't matter to me, as long as we don't require people to write new makefile rules just because they want to run a different set of test cases. -- Eduardo
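To make the make-variable idea concrete, here is a hypothetical helper, written in Python rather than make so the example stays self-contained. It only illustrates how a variable such as EXCLUDE_TAGS=skip-travis could be turned into a negative Avocado tag filter without adding a new makefile rule per test selection; the script name and the variable handling are assumptions, not part of the series.

# Hypothetical helper: forwards EXCLUDE_TAGS (e.g. "skip-travis slow") to
# "avocado run" as a negative tag filter, assuming an Avocado version that
# understands --filter-by-tags and --filter-by-tags-include-empty.
import os
import shlex
import subprocess


def acceptance_cmdline(test_dir="tests/acceptance"):
    cmd = ["avocado", "run", test_dir]
    exclude = os.environ.get("EXCLUDE_TAGS", "").split()
    if exclude:
        # One comma-separated filter: every listed tag must be absent.
        cmd.append("--filter-by-tags=" + ",".join("-" + tag for tag in exclude))
        # Tests that carry no tags at all are still included.
        cmd.append("--filter-by-tags-include-empty")
    return cmd


if __name__ == "__main__":
    cmd = acceptance_cmdline()
    print("running:", " ".join(shlex.quote(c) for c in cmd))
    raise SystemExit(subprocess.call(cmd))

Run as "EXCLUDE_TAGS=skip-travis python3 run_acceptance.py" (the file name is made up), it behaves like the proposed "make check-acceptance EXCLUDE_TAGS=skip-travis"; with EXCLUDE_TAGS unset it simply runs everything.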
----- Original Message ----- > From: "Eduardo Habkost" <ehabkost@redhat.com> > To: "Cleber Rosa" <crosa@redhat.com> > Cc: "Philippe Mathieu-Daudé" <f4bug@amsat.org>, qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, > "Aleksandar Markovic" <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" <amarkovic@wavecomp.com>, "Aurelien > Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" <wainersm@redhat.com> > Sent: Wednesday, May 22, 2019 7:07:05 PM > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote: > > > > > > ----- Original Message ----- > > > From: "Eduardo Habkost" <ehabkost@redhat.com> > > > To: "Philippe Mathieu-Daudé" <f4bug@amsat.org> > > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, > > > "Aleksandar Markovic" > > > <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" > > > <amarkovic@wavecomp.com>, "Cleber Rosa" <crosa@redhat.com>, > > > "Aurelien Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" > > > <wainersm@redhat.com> > > > Sent: Wednesday, May 22, 2019 5:12:30 PM > > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > > > Hi, > > > > > > > > It was a rainy week-end here, so I invested it to automatize some > > > > of my MIPS tests. > > > > > > > > The BootLinuxSshTest is not Global warming friendly, it is not > > > > meant to run on a CI system but rather on a workstation previous > > > > to post a pull request. > > > > It can surely be improved, but it is a good starting point. > > > > > > Until we actually have a mechanism to exclude the test case on > > > travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > > please don't merge patch 4/4 yet or it will break travis-ci. > > > > > > Cleber, Wainer, is it already possible to make "avocado run" skip > > > tests tagged with "slow"? > > > > > > > The mechanism exists, but we haven't tagged any test so far as slow. > > > > Should we define/document a criteria for a test to be slow? Given > > that this is highly subjective, we have to think of: > > > > * Will we consider the average or maximum run time (the timeout > > definition)? > > > > * For a single test, what is "slow"? Some rough numbers from Travis > > CI[1] to help us with guidelines: > > - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > > - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > > - > > linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: > > PASS (18.14 s) > > - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > I don't think we need to overthink this. Whatever objective > criteria we choose, I'm sure we'll have to adapt them later due > to real world problems. > > e.g.: is 396 seconds too slow? I don't know, it depends: does it > break Travis and other CI systems often because of timeouts? If > yes, then we should probably tag it as slow. > It's not only that. We're close to a point where we'll need to determine whether "make check-acceptance" will work as a generic enough default for most user on their environments and most CI systems. As an example, this job ran 5 fairly slow tests (which I'm preparing to send): https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518 Those are justifiably slow, given the fact that they boot a full Fedora 30 system using TCG. The job has a cumulative execution time of ~39 minutes. 
That leaves only 11 minutes to spare on the Travis CI environment. If they all exercised close to their 600s allowances (timeout), the Travis job would have failed. Having said that, if a CI failure is supposed to be a major breakage, which I believe it's the right mind set and a worthy goal, we should limit the amount of tests we run so that their *maximum* execution time does not exceed the maximum job time limit. > If having subjective criteria is really a problem (I don't think > it is), then we can call the tag "skip_travis", and stop worrying > about defining what exactly is "slow". > > > > > > * Do we want to set a maximum job timeout? This way we can skip > > tests after a given amount of time has passed. Currently we interrupt > > the test running when the job timeout is reached, but it's possible > > to add a option so that no new tests will be started, but currently > > running ones will be waited on. > > I'm not sure I understand the suggestion to skip tests. If we > skip tests after a timeout, how would we differentiate a test > being expectedly slow from a QEMU hang? > > -- > Eduardo > Basically, what I meant is that we could attempt something like: * Job "Brave" - 50 tests, each with 60 seconds timeout = 50 min max - 60 tests, each with 1 second timeout = 1 min max If Job "Brave" is run on a system such as Travis, it *can* fail, because it can go over the maximum Travis CI job limit of 50 min. We could set an Avocado job timeout of say, 48 minutes, and tell Avocado to mark the tests it wasn't able to spawn as "SKIPPED", and do not report an overall error condition. But, if we want to be more conservative (which I now realize is the best mindset for this situation), we should stick to something like: * Job "Coward" - 47 tests, each with 60 seconds timeout = 47 min max - 60 tests, each with 1 second timeout = 1 min max So my proposal is that we should: * Give ample timeouts to test (at least 2x their average run time on Travis CI) * Define the standard job (make check-acceptance) as a set of tests that can run under the Travis CI job (discounted the average QEMU build time) This means that: * We'd tag some tests as "not-default", filtering them out of "make check-acceptance" * Supposing a developer is using a machine as least as powerful as the Travis CI environment, and assuming a build time of 10 minutes, his "make check-acceptance" maximum execution time would be in the order of ~39 minutes. I can work on adding the missing Avocado features, such as the ability to list/count the maximum job time for the given test selection. This should help us to maintain sound CI jobs, and good user experience. And finally, I'm sorry that I did overthink this... but I know that the time for hard choices are coming fast. Thanks, - Cleber.
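The arithmetic behind the "Brave" and "Coward" jobs is easy to write down. The helper below is only a back-of-the-envelope illustration of the rule proposed above: budget against the sum of per-test timeouts, i.e. the worst case, not the average run time. The 50-minute job limit and the 10-minute build time are the figures quoted in this thread; the function names are made up.

# Illustrative only: does a test selection's worst case (the sum of its
# timeouts) fit in what is left of a CI job once the build is discounted?

def worst_case_minutes(timeouts_s):
    """Worst-case run time of a test selection, in minutes."""
    return sum(timeouts_s) / 60.0


def fits_budget(timeouts_s, job_limit_min=50, build_time_min=0):
    """True if the worst case fits in the remaining job time."""
    return worst_case_minutes(timeouts_s) <= job_limit_min - build_time_min


brave = [60] * 50 + [1] * 60    # 50 tests x 60 s + 60 tests x 1 s = 51 min max
coward = [60] * 47 + [1] * 60   # 47 tests x 60 s + 60 tests x 1 s = 48 min max

print(fits_budget(brave))                      # False: can exceed the 50 min limit
print(fits_budget(coward))                     # True against the bare job limit
print(fits_budget(coward, build_time_min=10))  # False: only ~40 min remain after the build

That last line is the point of the more conservative mindset: a selection that fits the bare job limit can still blow the budget once build time is discounted, which is why the proposal sizes "make check-acceptance" against the time left after an average build.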
On Wed, May 22, 2019 at 09:04:17PM -0400, Cleber Rosa wrote: > > > ----- Original Message ----- > > From: "Eduardo Habkost" <ehabkost@redhat.com> > > To: "Cleber Rosa" <crosa@redhat.com> > > Cc: "Philippe Mathieu-Daudé" <f4bug@amsat.org>, qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, > > "Aleksandar Markovic" <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" <amarkovic@wavecomp.com>, "Aurelien > > Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" <wainersm@redhat.com> > > Sent: Wednesday, May 22, 2019 7:07:05 PM > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote: > > > > > > > > > ----- Original Message ----- > > > > From: "Eduardo Habkost" <ehabkost@redhat.com> > > > > To: "Philippe Mathieu-Daudé" <f4bug@amsat.org> > > > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arikalo@wavecomp.com>, > > > > "Aleksandar Markovic" > > > > <aleksandar.m.mail@gmail.com>, "Aleksandar Markovic" > > > > <amarkovic@wavecomp.com>, "Cleber Rosa" <crosa@redhat.com>, > > > > "Aurelien Jarno" <aurelien@aurel32.net>, "Wainer dos Santos Moschetta" > > > > <wainersm@redhat.com> > > > > Sent: Wednesday, May 22, 2019 5:12:30 PM > > > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests > > > > > > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote: > > > > > Hi, > > > > > > > > > > It was a rainy week-end here, so I invested it to automatize some > > > > > of my MIPS tests. > > > > > > > > > > The BootLinuxSshTest is not Global warming friendly, it is not > > > > > meant to run on a CI system but rather on a workstation previous > > > > > to post a pull request. > > > > > It can surely be improved, but it is a good starting point. > > > > > > > > Until we actually have a mechanism to exclude the test case on > > > > travis-ci, I will remove patch 4/4 from the queue. Aleksandar, > > > > please don't merge patch 4/4 yet or it will break travis-ci. > > > > > > > > Cleber, Wainer, is it already possible to make "avocado run" skip > > > > tests tagged with "slow"? > > > > > > > > > > The mechanism exists, but we haven't tagged any test so far as slow. > > > > > > Should we define/document a criteria for a test to be slow? Given > > > that this is highly subjective, we have to think of: > > > > > > * Will we consider the average or maximum run time (the timeout > > > definition)? > > > > > > * For a single test, what is "slow"? Some rough numbers from Travis > > > CI[1] to help us with guidelines: > > > - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s) > > > - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s) > > > - > > > linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: > > > PASS (18.14 s) > > > - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s) > > > > I don't think we need to overthink this. Whatever objective > > criteria we choose, I'm sure we'll have to adapt them later due > > to real world problems. > > > > e.g.: is 396 seconds too slow? I don't know, it depends: does it > > break Travis and other CI systems often because of timeouts? If > > yes, then we should probably tag it as slow. > > > > It's not only that. We're close to a point where we'll need to > determine whether "make check-acceptance" will work as a generic > enough default for most user on their environments and most CI > systems. 
> > As an example, this job ran 5 fairly slow tests (which I'm preparing > to send): > > https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518 > > Those are justifiably slow, given the fact that they boot a full > Fedora 30 system using TCG. The job has a cumulative execution time > of ~39 minutes. That leaves only 11 minutes to spare on the Travis > CI environment. If they all exercised close to their 600s allowances > (timeout), the Travis job would have failed. > > Having said that, if a CI failure is supposed to be a major breakage, > which I believe it's the right mind set and a worthy goal, we should > limit the amount of tests we run so that their *maximum* execution > time does not exceed the maximum job time limit. > > > If having subjective criteria is really a problem (I don't think > > it is), then we can call the tag "skip_travis", and stop worrying > > about defining what exactly is "slow". > > > > > > > > > > * Do we want to set a maximum job timeout? This way we can skip > > > tests after a given amount of time has passed. Currently we interrupt > > > the test running when the job timeout is reached, but it's possible > > > to add a option so that no new tests will be started, but currently > > > running ones will be waited on. > > > > I'm not sure I understand the suggestion to skip tests. If we > > skip tests after a timeout, how would we differentiate a test > > being expectedly slow from a QEMU hang? > > > > -- > > Eduardo > > > > Basically, what I meant is that we could attempt something like: > > * Job "Brave" > - 50 tests, each with 60 seconds timeout = 50 min max > - 60 tests, each with 1 second timeout = 1 min max > > If Job "Brave" is run on a system such as Travis, it *can* fail, > because it can go over the maximum Travis CI job limit of 50 min. > We could set an Avocado job timeout of say, 48 minutes, and tell > Avocado to mark the tests it wasn't able to spawn as "SKIPPED", > and do not report an overall error condition. Oh, that would be a nice feature. But while we don't have it, the following proposal would work too. > > But, if we want to be more conservative (which I now realize is > the best mindset for this situation), we should stick to something > like: > > * Job "Coward" > - 47 tests, each with 60 seconds timeout = 47 min max > - 60 tests, each with 1 second timeout = 1 min max > > So my proposal is that we should: > > * Give ample timeouts to test (at least 2x their average > run time on Travis CI) > > * Define the standard job (make check-acceptance) as a set > of tests that can run under the Travis CI job (discounted > the average QEMU build time) Agreed. > > This means that: > > * We'd tag some tests as "not-default", filtering them out > of "make check-acceptance" > > * Supposing a developer is using a machine as least as powerful > as the Travis CI environment, and assuming a build time of > 10 minutes, his "make check-acceptance" maximum execution > time would be in the order of ~39 minutes. > > I can work on adding the missing Avocado features, such as > the ability to list/count the maximum job time for the given test > selection. This should help us to maintain sound CI jobs, and good > user experience. Sounds good to me. > > And finally, I'm sorry that I did overthink this... but I know > that the time for hard choices are coming fast. The above proposals are cool, I don't think they are overthinking. I only meant that we shouldn't be looking to a formal definition of "slow", because "what exactly is a slow job?" 
isn't the important question we should be asking. "How do we avoid timeouts on CI jobs?" is the important question, and your proposals above help us address that. -- Eduardo
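Putting the proposals in this subthread together, a deliberately heavy test would end up looking something like the sketch below: a per-test timeout of at least twice its observed average (roughly 800 s for a test averaging ~397 s on Travis), plus a tag such as "not-default" so the default check-acceptance selection can filter it out. The class is a stand-in for the test named earlier, the tag name is only the one floated above, and the timeout class attribute is the usual Avocado way of giving a single test more time.

# Sketch only; the real BootLinuxAarch64 test lives in tests/acceptance and
# uses QEMU's own base class, while this stand-in derives from plain avocado.Test.
from avocado import Test


class BootLinuxAarch64(Test):

    # Observed average on Travis was ~397 s; give it about 2x as headroom.
    timeout = 800

    def test_virt(self):
        """
        :avocado: tags=arch:aarch64,machine:virt
        :avocado: tags=not-default
        """
        self.log.info("full distro boot under TCG would go here")

The default run would then exclude it with a negative filter on "not-default", as in the earlier sketches, while a maintainer who wants the complete set simply passes no exclusion at all.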