[PATCH] gitlab-ci: split clang-user to avoid timeout

Stefan Hajnoczi posted 1 patch 1 year, 6 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20221103212321.387738-1-stefanha@redhat.com
Maintainers: "Alex Bennée" <alex.bennee@linaro.org>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Thomas Huth <thuth@redhat.com>, Wainer dos Santos Moschetta <wainersm@redhat.com>, Beraldo Leal <bleal@redhat.com>
[PATCH] gitlab-ci: split clang-user to avoid timeout
Posted by Stefan Hajnoczi 1 year, 6 months ago
GitLab CI times out when the clang-user job takes over 1 hour. Split it
into parts that check various architectures.

An alternative is to have one job per architecture but that clutters the
pipeline view and maybe there is some sharing when multiple targets are
built at once.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 .gitlab-ci.d/buildtest-template.yml | 11 +++++++++++
 .gitlab-ci.d/buildtest.yml          | 18 +++++++++---------
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/.gitlab-ci.d/buildtest-template.yml b/.gitlab-ci.d/buildtest-template.yml
index 73ecfabb8d..38b055e139 100644
--- a/.gitlab-ci.d/buildtest-template.yml
+++ b/.gitlab-ci.d/buildtest-template.yml
@@ -81,3 +81,14 @@
     - du -chs ${CI_PROJECT_DIR}/avocado-cache
   variables:
     QEMU_JOB_AVOCADO: 1
+
+.clang-user-template:
+  extends: .native_build_job_template
+  needs:
+    job: amd64-debian-user-cross-container
+  variables:
+    IMAGE: debian-all-test-cross
+    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --disable-system
+      --extra-cflags=-fsanitize=undefined
+      --extra-cflags=-fno-sanitize-recover=undefined
+    MAKE_CHECK_ARGS: check-unit check-tcg
diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 6c05c46397..116fce4e8f 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -323,16 +323,16 @@ clang-system:
       ppc-softmmu s390x-softmmu
     MAKE_CHECK_ARGS: check-qtest check-tcg
 
-clang-user:
-  extends: .native_build_job_template
-  needs:
-    job: amd64-debian-user-cross-container
+# clang-user takes too long so split it into parts
+clang-user-part1:
+  extends: .clang-user-template
   variables:
-    IMAGE: debian-all-test-cross
-    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --disable-system
-      --target-list-exclude=microblazeel-linux-user,aarch64_be-linux-user,i386-linux-user,m68k-linux-user,mipsn32el-linux-user,xtensaeb-linux-user
-      --extra-cflags=-fsanitize=undefined --extra-cflags=-fno-sanitize-recover=undefined
-    MAKE_CHECK_ARGS: check-unit check-tcg
+    TARGETS: aarch64-linux-user,alpha-linux-user,armeb-linux-user,arm-linux-user,cris-linux-user,hexagon-linux-user,hppa-linux-user,loongarch64-linux-user,microblaze-linux-user,mips64el-linux-user,mips64-linux-user,mipsel-linux-user,mips-linux-user,mipsn32-linux-user
+
+clang-user-part2:
+  extends: .clang-user-template
+  variables:
+    TARGETS: nios2-linux-user,or1k-linux-user,ppc64le-linux-user,ppc64-linux-user,ppc-linux-user,riscv32-linux-user,riscv64-linux-user,s390x-linux-user,sh4eb-linux-user,sh4-linux-user,sparc32plus-linux-user,sparc64-linux-user,sparc-linux-user,x86_64-linux-user,xtensa-linux-user
 
 # Set LD_JOBS=1 because this requires LTO and ld consumes a large amount of memory.
 # On gitlab runners, default value sometimes end up calling 2 lds concurrently and
-- 
2.38.1
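
For comparison, the one-job-per-architecture alternative mentioned in the commit message might look roughly like the sketch below. It assumes the .clang-user-template added by this patch and uses GitLab's parallel:matrix keyword, which is not part of the patch; the job name and the three targets shown are only illustrative.

```yaml
# Hypothetical alternative, not part of the patch: one job per target.
# Assumes the build template passes $TARGETS on to configure.
clang-user:
  extends: .clang-user-template
  parallel:
    matrix:
      # Each list entry yields a separate job with TARGETS set to that value.
      - TARGETS:
          - aarch64-linux-user
          - alpha-linux-user
          - arm-linux-user
          # ... remaining targets, one per line
```

This trades a longer job list for shorter individual jobs. Each job would presumably repeat the configure step and rebuild any code shared between targets.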
Re: [PATCH] gitlab-ci: split clang-user to avoid timeout
Posted by Thomas Huth 1 year, 6 months ago
On 03/11/2022 22.23, Stefan Hajnoczi wrote:
> GitLab CI times out when the clang-user job takes over 1 hour.

Oh, that's new to me ... is that a regression? Has something become slower? 
Or did we just add more stuff to the user builds recently?

Anyway, if it's just taking a little bit longer than 1h, it's likely better 
to bump the timeout by 10 minutes (to 70 minutes). I guess that will still 
take fewer CI minutes than running two jobs.

  Thomas
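
The timeout bump suggested above might look like the following sketch. It keeps the pre-patch clang-user job and adds GitLab's job-level timeout keyword. Whether the runners in use allow jobs longer than one hour is an assumption, since a runner's own maximum timeout takes precedence.

```yaml
# Sketch only: keep the single clang-user job and raise its time limit.
clang-user:
  extends: .native_build_job_template
  needs:
    job: amd64-debian-user-cross-container
  # Job-level limit; it cannot exceed the runner's maximum job timeout.
  timeout: 1h 10m
  variables:
    IMAGE: debian-all-test-cross
    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --disable-system
      --target-list-exclude=microblazeel-linux-user,aarch64_be-linux-user,i386-linux-user,m68k-linux-user,mipsn32el-linux-user,xtensaeb-linux-user
      --extra-cflags=-fsanitize=undefined --extra-cflags=-fno-sanitize-recover=undefined
    MAKE_CHECK_ARGS: check-unit check-tcg
```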
Re: [PATCH] gitlab-ci: split clang-user to avoid timeout
Posted by Philippe Mathieu-Daudé 1 year, 6 months ago
On 4/11/22 07:27, Thomas Huth wrote:
> On 03/11/2022 22.23, Stefan Hajnoczi wrote:
>> GitLab CI times out when the clang-user job takes over 1 hour.
> 
> Oh, that's new to me ... is that a regression? Has something become 
> slower? Or did we just add more stuff to the user builds recently?

We added more TCG tests:

$ git diff --stat v7.1.0.. -- tests/tcg/
  tests/tcg/Makefile.target                              |   36 +-
  tests/tcg/aarch64/Makefile.softmmu-target              |   11 +-
  tests/tcg/aarch64/Makefile.target                      |   15 +-
  tests/tcg/arm/Makefile.target                          |    9 +-
  tests/tcg/cris/Makefile.target                         |    2 +-
  tests/tcg/hexagon/usr.c                                |   10 +
  tests/tcg/i386/Makefile.softmmu-target                 |    3 +-
  tests/tcg/i386/Makefile.target                         |   41 +-
  tests/tcg/i386/README                                  |    9 +
  tests/tcg/i386/test-3dnow.c                            |    3 +
  tests/tcg/i386/test-avx.c                              |  364 ++++++++++
  tests/tcg/i386/test-avx.py                             |  375 ++++++++++
  tests/tcg/i386/test-i386-bmi2.c                        |  169 ++++-
  tests/tcg/i386/test-i386.c                             |  575 +--------------
  tests/tcg/i386/test-mmx.c                              |  315 ++++++++
  tests/tcg/i386/test-mmx.py                             |  244 +++++++
  tests/tcg/i386/x86.csv                                 | 4658 ++++++++++++++++++++
  tests/tcg/multiarch/Makefile.target                    |   21 +-
  tests/tcg/multiarch/linux/linux-madvise.c              |   70 ++
  tests/tcg/multiarch/munmap-pthread.c                   |   79 +++
  tests/tcg/multiarch/noexec.c.inc                       |  139 ++++
  tests/tcg/multiarch/system/Makefile.softmmu-target     |    2 +-
  tests/tcg/nios2/10m50-ghrd.ld                          |   14 +-
  tests/tcg/nios2/Makefile.softmmu-target                |    3 +-
  tests/tcg/ppc64/Makefile.target                        |    8 +-
  tests/tcg/ppc64le/Makefile.target                      |   26 +-
  tests/tcg/riscv64/Makefile.target                      |    1 +
  tests/tcg/riscv64/noexec.c                             |   79 +++
  tests/tcg/s390x/Makefile.target                        |   34 +-
  tests/tcg/s390x/noexec.c                               |  106 +++
  tests/tcg/s390x/vistr.c                                |   45 ++
  tests/tcg/sh4/Makefile.target                          |   12 -
  tests/tcg/x86_64/Makefile.softmmu-target               |    3 +-
  tests/tcg/x86_64/Makefile.target                       |    7 +-
  tests/tcg/x86_64/noexec.c                              |   75 ++
  42 files changed, 6877 insertions(+), 686 deletions(-)

Also more s390x tests are going to be merged soon.

> Anyway, if it's just taking a little bit longer than 1h, it's likely 
> better to bump the timeout by 10 minutes (to 70 minutes), I guess that 
> will still take less CI minutes to run than to have two jobs.
> 
>   Thomas
> 


Re: [PATCH] gitlab-ci: split clang-user to avoid timeout
Posted by Alex Bennée 1 year, 6 months ago
Philippe Mathieu-Daudé <philmd@linaro.org> writes:

> On 4/11/22 07:27, Thomas Huth wrote:
>> On 03/11/2022 22.23, Stefan Hajnoczi wrote:
>>> GitLab CI times out when the clang-user job takes over 1 hour.
>> Oh, that's new to me ... is that a regression? Has something become
>> slower? Or did we just add more stuff to the user builds recently?
>
> We added more TCG tests:
>
> $ git diff --stat v7.1.0.. -- tests/tcg/
>  tests/tcg/Makefile.target                              |   36 +-
<snip>

But are any of them particularly slow? TCG tests are generally quick (or
at least should be).

>
> Also more s390x tests are going to be merged soon.
>
>> Anyway, if it's just taking a little bit longer than 1h, it's likely
>> better to bump the timeout by 10 minutes (to 70 minutes), I guess
>> that will still take less CI minutes to run than to have two jobs.
>>   Thomas
>> 


-- 
Alex Bennée
Re: [PATCH] gitlab-ci: split clang-user to avoid timeout
Posted by Philippe Mathieu-Daudé 1 year, 6 months ago
+Richard

On 3/11/22 22:23, Stefan Hajnoczi wrote:
> GitLab CI times out when the clang-user job takes over 1 hour. Split it
> into parts that check various architectures.
> 
> An alternative is to have one job per architecture but that clutters the
> pipeline view and maybe there is some sharing when multiple targets are
> built at once.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   .gitlab-ci.d/buildtest-template.yml | 11 +++++++++++
>   .gitlab-ci.d/buildtest.yml          | 18 +++++++++---------
>   2 files changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/.gitlab-ci.d/buildtest-template.yml b/.gitlab-ci.d/buildtest-template.yml
> index 73ecfabb8d..38b055e139 100644
> --- a/.gitlab-ci.d/buildtest-template.yml
> +++ b/.gitlab-ci.d/buildtest-template.yml
> @@ -81,3 +81,14 @@
>       - du -chs ${CI_PROJECT_DIR}/avocado-cache
>     variables:
>       QEMU_JOB_AVOCADO: 1
> +
> +.clang-user-template:
> +  extends: .native_build_job_template
> +  needs:
> +    job: amd64-debian-user-cross-container
> +  variables:
> +    IMAGE: debian-all-test-cross
> +    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --disable-system
> +      --extra-cflags=-fsanitize=undefined
> +      --extra-cflags=-fno-sanitize-recover=undefined
> +    MAKE_CHECK_ARGS: check-unit check-tcg
> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> index 6c05c46397..116fce4e8f 100644
> --- a/.gitlab-ci.d/buildtest.yml
> +++ b/.gitlab-ci.d/buildtest.yml
> @@ -323,16 +323,16 @@ clang-system:
>         ppc-softmmu s390x-softmmu
>       MAKE_CHECK_ARGS: check-qtest check-tcg
>   
> -clang-user:
> -  extends: .native_build_job_template
> -  needs:
> -    job: amd64-debian-user-cross-container
> +# clang-user takes too long so split it into parts
> +clang-user-part1:
> +  extends: .clang-user-template
>     variables:
> -    IMAGE: debian-all-test-cross
> -    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --disable-system
> -      --target-list-exclude=microblazeel-linux-user,aarch64_be-linux-user,i386-linux-user,m68k-linux-user,mipsn32el-linux-user,xtensaeb-linux-user

We can exclude these targets, which are somewhat redundant:

armeb-linux-user,mips64-linux-user,mipsel-linux-user,mipsn32-linux-user,ppc64-linux-user,sh4-linux-user,sparc-linux-user,riscv32-linux-user

Alternatively, instead of using the exclude pattern, we can switch to 
including the targets which do have tcg tests:

$ ls -1 tests/tcg/
Makefile.target
README
aarch64
alpha
arm
cris
hexagon
hppa
i386
loongarch64
m68k
minilib
mips
multiarch
nios2
openrisc
ppc
ppc64
ppc64le
riscv64
s390x
sh4
sparc64
tricore
x86_64
xtensa

Although we have 'multiarch' :/

We can also drop check-unit, though I'm not sure it saves much.

> -      --extra-cflags=-fsanitize=undefined --extra-cflags=-fno-sanitize-recover=undefined
> -    MAKE_CHECK_ARGS: check-unit check-tcg
> +    TARGETS: aarch64-linux-user,alpha-linux-user,armeb-linux-user,arm-linux-user,cris-linux-user,hexagon-linux-user,hppa-linux-user,loongarch64-linux-user,microblaze-linux-user,mips64el-linux-user,mips64-linux-user,mipsel-linux-user,mips-linux-user,mipsn32-linux-user
> +
> +clang-user-part2:
> +  extends: .clang-user-template
> +  variables:
> +    TARGETS: nios2-linux-user,or1k-linux-user,ppc64le-linux-user,ppc64-linux-user,ppc-linux-user,riscv32-linux-user,riscv64-linux-user,s390x-linux-user,sh4eb-linux-user,sh4-linux-user,sparc32plus-linux-user,sparc64-linux-user,sparc-linux-user,x86_64-linux-user,xtensa-linux-user
>   
>   # Set LD_JOBS=1 because this requires LTO and ld consumes a large amount of memory.
>   # On gitlab runners, default value sometimes end up calling 2 lds concurrently and
Re: [PATCH] gitlab-ci: split clang-user to avoid timeout
Posted by Richard Henderson 1 year, 6 months ago
On 11/4/22 09:32, Philippe Mathieu-Daudé wrote:
> +Richard
> 
> On 3/11/22 22:23, Stefan Hajnoczi wrote:
>> GitLab CI times out when the clang-user job takes over 1 hour. Split it
>> into parts that check various architectures.
>>
>> An alternative is to have one job per architecture but that clutters the
>> pipeline view and maybe there is some sharing when multiple targets are
>> built at once.
>>
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> ---
>>   .gitlab-ci.d/buildtest-template.yml | 11 +++++++++++
>>   .gitlab-ci.d/buildtest.yml          | 18 +++++++++---------
>>   2 files changed, 20 insertions(+), 9 deletions(-)
>>
>> diff --git a/.gitlab-ci.d/buildtest-template.yml b/.gitlab-ci.d/buildtest-template.yml
>> index 73ecfabb8d..38b055e139 100644
>> --- a/.gitlab-ci.d/buildtest-template.yml
>> +++ b/.gitlab-ci.d/buildtest-template.yml
>> @@ -81,3 +81,14 @@
>>       - du -chs ${CI_PROJECT_DIR}/avocado-cache
>>     variables:
>>       QEMU_JOB_AVOCADO: 1
>> +
>> +.clang-user-template:
>> +  extends: .native_build_job_template
>> +  needs:
>> +    job: amd64-debian-user-cross-container
>> +  variables:
>> +    IMAGE: debian-all-test-cross
>> +    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --disable-system
>> +      --extra-cflags=-fsanitize=undefined
>> +      --extra-cflags=-fno-sanitize-recover=undefined
>> +    MAKE_CHECK_ARGS: check-unit check-tcg
>> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
>> index 6c05c46397..116fce4e8f 100644
>> --- a/.gitlab-ci.d/buildtest.yml
>> +++ b/.gitlab-ci.d/buildtest.yml
>> @@ -323,16 +323,16 @@ clang-system:
>>         ppc-softmmu s390x-softmmu
>>       MAKE_CHECK_ARGS: check-qtest check-tcg
>> -clang-user:
>> -  extends: .native_build_job_template
>> -  needs:
>> -    job: amd64-debian-user-cross-container
>> +# clang-user takes too long so split it into parts
>> +clang-user-part1:
>> +  extends: .clang-user-template
>>     variables:
>> -    IMAGE: debian-all-test-cross
>> -    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --disable-system
>> -      --target-list-exclude=microblazeel-linux-user,aarch64_be-linux-user,i386-linux-user,m68k-linux-user,mipsn32el-linux-user,xtensaeb-linux-user
> 
> We can exclude these targets which are a bit redundant:
> 
> 
> armeb-linux-user,mips64-linux-user,mipsel-linux-user,mipsn32-linux-user,ppc64-linux-user,sh4-linux-user,sparc-linux-user,riscv32-linux-user
> 
> Alternatively, instead of using the exclude pattern, we can switch to including the 
> targets which do have tcg tests:
> 
> $ ls -1 tests/tcg/
> Makefile.target
> README
> aarch64
> alpha
> arm
> cris
> hexagon
> hppa
> i386
> loongarch64
> m68k
> minilib
> mips
> multiarch
> nios2
> openrisc
> ppc
> ppc64
> ppc64le
> riscv64
> s390x
> sh4
> sparc64
> tricore
> x86_64
> xtensa
> 
> Although we have 'multiarch' :/

If we're talking about tests, this runs on debian-all-test-cross, which has fewer 
cross-compilers than that.

However, the main purpose of clang-user is to make sure that stuff *builds* with clang, as 
opposed to gcc, which is where we've seen most problems in the past.  So we do want as 
much coverage across targets/*/ as possible, even if cross-compilers for tests are not 
available.

I agree that we can drop some redundancy, like aarch64_be, armeb, mips{set}, riscv32, 
which have no remarkable difference in linux-user/.  But be careful of e.g. ppc64 vs 
ppc64le and sparc vs sparc64, which have very different ABIs.

Perhaps an interesting split would be those guests supported by debian-all-test-cross, for 
which we build + test, and the others, for which we build only.


r~
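
The split proposed above could be sketched as two jobs on top of the .clang-user-template from the patch. The job names and the placeholder target lists are hypothetical, and the sketch assumes the build template skips "make check" when MAKE_CHECK_ARGS is empty. Which guests actually have cross-compilers in debian-all-test-cross would need to be checked against the container definition.

```yaml
# Hypothetical split, not taken from the patch.
# Guests with cross-compilers in debian-all-test-cross: build and test.
clang-user-test:
  extends: .clang-user-template
  variables:
    TARGETS: "<guests with test cross-compilers>"   # placeholder, fill in

# Remaining guests: build only, to keep clang coverage across targets.
clang-user-build-only:
  extends: .clang-user-template
  variables:
    TARGETS: "<remaining guests>"                   # placeholder, fill in
    # Assumption: an empty value makes the template skip the test step.
    MAKE_CHECK_ARGS: ""
```

Because variables set in a job override those inherited through extends, the build-only job needs no second template.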