[PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes

Thomas Huth posted 1 patch 2 years, 5 months ago
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20211116163309.246602-1-thuth@redhat.com
Maintainers: "Philippe Mathieu-Daudé" <f4bug@amsat.org>, Wainer dos Santos Moschetta <wainersm@redhat.com>, Willian Rampazzo <willianr@redhat.com>, "Alex Bennée" <alex.bennee@linaro.org>, Thomas Huth <thuth@redhat.com>
[PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
Posted by Thomas Huth 2 years, 5 months ago
The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
be scheduled, so while the build test itself finishes within 60 minutes,
the total run time of the jobs can be longer due to this waiting time.
Thus let's increase the timeout on the gitlab side a little bit, so
that these jobs are not marked as failing just because of the delay.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 .gitlab-ci.d/cirrus.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
index e7b25e7427..22d42585e4 100644
--- a/.gitlab-ci.d/cirrus.yml
+++ b/.gitlab-ci.d/cirrus.yml
@@ -14,6 +14,7 @@
   stage: build
   image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
   needs: []
+  timeout: 80m
   allow_failure: true
   script:
     - source .gitlab-ci.d/cirrus/$NAME.vars
-- 
2.27.0


Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
Posted by Willian Rampazzo 2 years, 5 months ago
On Tue, Nov 16, 2021 at 1:33 PM Thomas Huth <thuth@redhat.com> wrote:
>
> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
> be scheduled, so while the build test itself finishes within 60 minutes,
> the total run time of the jobs can be longer due to this waiting time.
> Thus let's increase the timeout on the gitlab side a little bit, so
> that these jobs are not marked as failing just because of the delay.
>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  .gitlab-ci.d/cirrus.yml | 1 +
>  1 file changed, 1 insertion(+)
>

Reviewed-by: Willian Rampazzo <willianr@redhat.com>


Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
Posted by Daniel P. Berrangé 2 years, 5 months ago
On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
> be scheduled, so while the build test itself finishes within 60 minutes,
> the total run time of the jobs can be longer due to this waiting time.
> Thus let's increase the timeout on the gitlab side a little bit, so
> that these jobs are not marked as failing just because of the delay.

On a successful pipeline I see

 freebsd-11  - 28 minutes
 freebsd-12  - 57 minutes
 macos       - 30 minutes

We know cirrus allows 2 concurrent jobs, so from that I infer
that the freebsd-12 job was queued for ~30 minutes waiting for
either the freebsd-11 or macos job to finish, and then it
ran in 30 minutes, giving the ~60 minute total.

That's too close to the 60 minute gitlab default job timeout
for comfort - it can easily slip over 60 minutes by just a
small amount.

80 minutes will certainly help in the case where we
randomly take a little longer than 30 minutes to build,
and have 1 of the 3 jobs queued.

When we're running jobs on both master + staging, we can
have 2 jobs running and 4 more queued - 2 of those queued
might just finish in time, but 2 will definitely fail.
My patch will cut these extra jobs on master, so in the common
case we only ever get 1 queued, which should work well in
combo with your patch here. That should be good enough
for the qemu-project namespace, unless someone is triggering
pipelines for stable branch staging at the same time as
the master branch staging.

If we do want to worry about more than 2 queued jobs
again for that reason, we might consider putting
it up to 100 minutes. That would give us enough slack to
have 4 queued jobs behind two running jobs and have
them all succeed.
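
A minimal sketch of the queueing arithmetic above, under loudly stated
assumptions: every build takes a flat 30 minutes, all jobs are submitted
at the same moment, Cirrus runs 2 jobs concurrently, and the GitLab
timeout covers queue wait plus build time. The function name and its
parameters are hypothetical, purely for illustration, and not part of
any CI tooling.

```python
import heapq


def completion_times(num_jobs, build_minutes=30, slots=2):
    """Return each job's total wall-clock time (queue wait + build) in minutes.

    Assumes all jobs are submitted at t=0 and every build takes exactly
    build_minutes; real build times vary.
    """
    free_at = [0] * slots          # time at which each concurrent slot frees up
    heapq.heapify(free_at)
    totals = []
    for _ in range(num_jobs):
        start = heapq.heappop(free_at)   # earliest available slot
        end = start + build_minutes
        heapq.heappush(free_at, end)
        totals.append(end)
    return totals


if __name__ == "__main__":
    totals = completion_times(6)         # 2 running + 4 queued
    for timeout in (80, 100):
        in_time = sum(t < timeout for t in totals)
        print(f"timeout {timeout}m: {in_time} of {len(totals)} jobs finish in time")
```

Under these assumptions the six jobs finish at 30, 30, 60, 60, 90 and
90 minutes, so an 80 minute limit passes four of them and a 100 minute
limit passes all six.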

> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  .gitlab-ci.d/cirrus.yml | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
> index e7b25e7427..22d42585e4 100644
> --- a/.gitlab-ci.d/cirrus.yml
> +++ b/.gitlab-ci.d/cirrus.yml
> @@ -14,6 +14,7 @@
>    stage: build
>    image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>    needs: []
> +  timeout: 80m
>    allow_failure: true
>    script:
>      - source .gitlab-ci.d/cirrus/$NAME.vars

Whether 80 or 100 minutes, consider it

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
Posted by Philippe Mathieu-Daudé 2 years, 5 months ago
On 11/16/21 17:49, Daniel P. Berrangé wrote:
> On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
>> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
>> be scheduled, so while the build test itself finishes within 60 minutes,
>> the total run time of the jobs can be longer due to this waiting time.
>> Thus let's increase the timeout on the gitlab side a little bit, so
>> that these jobs are not marked as failing just because of the delay.
> 
> On a successful pipeline I see
> 
>  freebsd-11  - 28 minutes
>  freebsd-12  - 57 minutes
>  macos       - 30 minutes
> 
> We know cirrus allows 2 concurrent jobs, so from that I infer
> that the freebsd-12 job was queued for ~30 minutes waiting for
> either the freebsd-11 or macos job to finish, and then it
> ran in 30 minutes, giving the ~60 minute total.
> 
> That's too close to the 60 minute gitlab default job timeout
> for comfort - it can easily slip over 60 minutes by just a
> small amount.
> 
> 80 minutes will certainly help in the case where we
> randomly take a little longer than 30 minutes to build,
> and have 1 of the 3 jobs queued.
> 
> When we're running jobs on both master + staging, we can
> have 2 jobs running and 4 more queued - 2 of those queued
> might just finish in time, but 2 will definitely fail.
> My patch will cut these extra jobs on master, so in the common
> case we only ever get 1 queued, which should work well in
> combo with your patch here. That should be good enough
> for the qemu-project namespace, unless someone is triggering
> pipelines for stable branch staging at the same time as
> the master branch staging.
> 
> If we do want to worry about more than 2 queued jobs
> again for that reason, we might consider putting
> it up to 100 minutes. That would give us enough slack to
> have 4 queued jobs behind two running jobs and have
> them all succeed.
> 
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  .gitlab-ci.d/cirrus.yml | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
>> index e7b25e7427..22d42585e4 100644
>> --- a/.gitlab-ci.d/cirrus.yml
>> +++ b/.gitlab-ci.d/cirrus.yml
>> @@ -14,6 +14,7 @@
>>    stage: build
>>    image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>>    needs: []
>> +  timeout: 80m
>>    allow_failure: true
>>    script:
>>      - source .gitlab-ci.d/cirrus/$NAME.vars
> 
> Whether 80 or 100 minutes, consider it
> 
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

This pipeline took 1h51m09s:
https://gitlab.com/qemu-project/qemu/-/pipelines/409666733/builds
But Richard restarted unstable jobs, which probably added time
to the total.

IIRC, from a maintainer perspective 1h15 is the upper limit.
80m fits, 100m is over. It's up to the project maintainers
(personally I don't have any objection, in particular if
this reduces the failure rate).

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>


Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
Posted by Thomas Huth 2 years, 5 months ago
On 16/11/2021 18.09, Philippe Mathieu-Daudé wrote:
> On 11/16/21 17:49, Daniel P. Berrangé wrote:
>> On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
>>> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
>>> be scheduled, so while the build test itself finishes within 60 minutes,
>>> the total run time of the jobs can be longer due to this waiting time.
>>> Thus let's increase the timeout on the gitlab side a little bit, so
>>> that these jobs are not marked as failing just because of the delay.
...
>>> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
>>> index e7b25e7427..22d42585e4 100644
>>> --- a/.gitlab-ci.d/cirrus.yml
>>> +++ b/.gitlab-ci.d/cirrus.yml
>>> @@ -14,6 +14,7 @@
>>>     stage: build
>>>     image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>>>     needs: []
>>> +  timeout: 80m
>>>     allow_failure: true
>>>     script:
>>>       - source .gitlab-ci.d/cirrus/$NAME.vars
>>
>> Whether 80 or 100 minutes, consider it
>>
>> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> 
> This pipeline took 1h51m09s:
> https://gitlab.com/qemu-project/qemu/-/pipelines/409666733/builds
> But Richard restarted unstable jobs, which probably added time
> to the total.
> 
> IIRC from a maintainer perspective 1h15 is the upper limit.
> 80m fits, 100m is over.

I think I agree ... I normally don't want to wait much more than one
hour, so 100 minutes feels too long already. We already have some 70m
timeouts in other jobs, and one 80 minute timeout in
.gitlab-ci.d/crossbuild-template.yml, so I'd say 80 minutes is really the
upper boundary that we should use.

> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

Thanks to all for your reviews!

  Thomas


Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
Posted by Richard Henderson 2 years, 5 months ago
On 11/16/21 6:22 PM, Thomas Huth wrote:
> On 16/11/2021 18.09, Philippe Mathieu-Daudé wrote:
>> On 11/16/21 17:49, Daniel P. Berrangé wrote:
>>> On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
>>>> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
>>>> be scheduled, so while the build test itself finishes within 60 minutes,
>>>> the total run time of the jobs can be longer due to this waiting time.
>>>> Thus let's increase the timeout on the gitlab side a little bit, so
>>>> that these jobs are not marked as failing just because of the delay.
> ...
>>>> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
>>>> index e7b25e7427..22d42585e4 100644
>>>> --- a/.gitlab-ci.d/cirrus.yml
>>>> +++ b/.gitlab-ci.d/cirrus.yml
>>>> @@ -14,6 +14,7 @@
>>>>     stage: build
>>>>     image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>>>>     needs: []
>>>> +  timeout: 80m
>>>>     allow_failure: true
>>>>     script:
>>>>       - source .gitlab-ci.d/cirrus/$NAME.vars
>>>
>>> Whether 80 or 100 minutes, consider it
>>>
>>> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
>>
>> This pipeline took 1h51m09s:
>> https://gitlab.com/qemu-project/qemu/-/pipelines/409666733/builds
>> But Richard restarted unstable jobs, which probably added time
>> to the total.
>>
>> IIRC from a maintainer perspective 1h15 is the upper limit.
>> 80m fits, 100m is over.
> 
> I think I agree ... I normally don't want to wait much more than one hour, so 100
> minutes feels too long already. We already have some 70m timeouts in other jobs, and
> one 80 minute timeout in .gitlab-ci.d/crossbuild-template.yml, so I'd say 80 minutes
> is really the upper boundary that we should use.

We are also talking apples and oranges:
Gitlab timeouts are on the amount of time the job runs.
Cirrus timeouts appear to be on the amount of time the job is queued.

If cirrus would just not start accounting until the thing runs, we'd be fine.


r~