I'm not sure if this was discussed previously, but I noticed we're not
pulling the images we push to the registry at every pipeline run.

I would expect we don't actually need to rebuild container images at
_every_ pipeline run, so I propose we add a "docker pull" to the
container templates. We already have that for the docker-edk2|opensbi
images.

Some containers can take a long time to build (14 mins) and pulling
the image first without building can cut the time to about 3
mins. With this we can save almost 2h of cumulative CI time per
pipeline run:

|                                     | master  | pull-only |     diff |
| container                           | 0:02:34 | 0:02:09   | 00:00:25 |
| alpha-debian-cross-container        | 0:04:45 | 0:02:40   | 00:02:05 |
| amd64-alpine-container              | 0:09:51 | 0:02:56   | 00:06:55 |
| amd64-centos8-container             | 0:07:21 | 0:02:49   | 00:04:32 |
| amd64-debian-container              | 0:06:00 | 0:02:37   | 00:03:23 |
| amd64-debian-cross-container        | 0:14:22 | 0:03:41   | 00:10:41 |
| amd64-debian-user-cross-container   | 0:10:14 | 0:03:24   | 00:06:50 |
| amd64-fedora-container              | 0:12:09 | 0:02:49   | 00:09:20 |
| amd64-opensuse-leap-container       | 0:07:33 | 0:02:45   | 00:04:48 |
| amd64-ubuntu2004-container          | 0:08:28 | 0:03:07   | 00:05:21 |
| arm64-debian-cross-container        | 0:04:27 | 0:02:58   | 00:01:29 |
| armel-debian-cross-container        | 0:08:01 | 0:02:55   | 00:05:06 |
| armhf-debian-cross-container        | 0:03:33 | 0:02:18   | 00:01:15 |
| cris-fedora-cross-container         | 0:00:28 | 0:00:28   | 00:00:00 |
| docker-edk2                         | 0:00:25 | 0:00:28   |-00:00:03 |
| docker-opensbi                      | 0:08:34 | 0:03:10   | 00:05:24 |
| hexagon-cross-container             | 0:02:34 | 0:02:08   | 00:00:26 |
| hppa-debian-cross-container         | 0:04:50 | 0:02:28   | 00:02:22 |
| i386-fedora-cross-container         | 0:02:36 | 0:02:12   | 00:00:24 |
| m68k-debian-cross-container         | 0:02:40 | 0:02:09   | 00:00:31 |
| mips-debian-cross-container         | 0:02:38 | 0:02:09   | 00:00:29 |
| mips64-debian-cross-container       | 0:04:28 | 0:02:48   | 00:01:40 |
| mips64el-debian-cross-container     | 0:07:07 | 0:02:46   | 00:04:21 |
| mipsel-debian-cross-container       | 0:03:51 | 0:02:21   | 00:01:30 |
| powerpc-test-cross-container        | 0:08:52 | 0:03:00   | 00:05:52 |
| ppc64el-debian-cross-container      | 0:06:07 | 0:02:49   | 00:03:18 |
| python-container                    | 0:04:37 | 0:02:26   | 00:02:11 |
| riscv64-debian-cross-container      | 0:02:39 | 0:02:08   | 00:00:31 |
| riscv64-debian-test-cross-container | 0:08:03 | 0:03:00   | 00:05:03 |
| s390x-debian-cross-container        | 0:02:34 | 0:02:08   | 00:00:26 |
| sh4-debian-cross-container          | 0:02:37 | 0:02:09   | 00:00:28 |
| sparc64-debian-cross-container      | 0:04:25 | 0:02:17   | 00:02:08 |
| tricore-debian-cross-container      | 0:12:51 | 0:03:27   | 00:09:24 |
| win32-fedora-cross-container        | 0:11:16 | 0:03:29   | 00:07:47 |
| win64-fedora-cross-container        | 0:03:28 | 0:02:20   | 00:01:08 |
| xtensa-debian-cross-container       |         |           |          |
| total                               |         |           | 01:57:30 |

We would need to devise a mechanism (not included here) to force the
re-build of the container images when needed, perhaps an environment
variable or even a whole new "container build" stage before the
"container" stage.

What do you think?

Fabiano Rosas (1):
  ci: Attempt to pull container images before building

 .gitlab-ci.d/container-template.yml | 1 +
 1 file changed, 1 insertion(+)

-- 
2.35.3
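[For illustration, a sketch of what the proposed one-liner could look like in .gitlab-ci.d/container-template.yml. The template name, image tag, and surrounding keys are assumptions for this sketch, not the tree's exact contents; the "|| true" keeps a first run on a fresh fork, where the image doesn't exist yet, from failing the job:]

```yaml
.container_job_template:
  stage: containers
  script:
    # Warm the local image store first so the build below can reuse
    # cached layers instead of rebuilding everything from scratch.
    # The pull may fail on a fresh fork registry, hence "|| true".
    - docker pull "$CI_REGISTRY_IMAGE/qemu/$NAME:latest" || true
    - ./tests/docker/docker.py --engine docker build -t "qemu/$NAME"
        -f "tests/docker/dockerfiles/$NAME.docker"
        -r $CI_REGISTRY/qemu-project/qemu
```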
On Thu, Feb 23, 2023 at 11:21:53AM -0300, Fabiano Rosas wrote:
> I'm not sure if this was discussed previously, but I noticed we're not
> pulling the images we push to the registry at every pipeline run.
>
> I would expect we don't actually need to rebuild container images at
> _every_ pipeline run, so I propose we add a "docker pull" to the
> container templates. We already have that for the docker-edk2|opensbi
> images.
>
> Some containers can take a long time to build (14 mins) and pulling
> the image first without building can cut the time to about 3
> mins. With this we can save almost 2h of cumulative CI time per
> pipeline run:

The docker.py script that we're invoking is already pulling the
image itself. To pick a random recent job:

  https://gitlab.com/qemu-project/qemu/-/jobs/3806090058

we can see

  $ ./tests/docker/docker.py --engine docker build -t "qemu/$NAME" \
      -f "tests/docker/dockerfiles/$NAME.docker" \
      -r $CI_REGISTRY/qemu-project/qemu
  Using default tag: latest
  latest: Pulling from qemu-project/qemu/qemu/debian-arm64-cross
  bb263680fed1: Pulling fs layer
  ...snip...

Nonetheless it still went ahead and rebuilt the image from scratch,
so something is going wrong here. I don't know why your change adding
an extra 'docker pull' would have any effect, given we're already
pulling, so I wonder if that's just a coincidental apparent change
due to the initial state of your fork's container registry.

Whenever I look at this I end up wishing our docker.py didn't exist
and that we could just directly do

  - docker pull "$TAG"
  - docker build --cache-from "$TAG" --tag "$TAG" -f "tests/docker/$NAME.docker"

as that should be sufficient to build the image with caching.

> We would need to devise a mechanism (not included here) to force the
> re-build of the container images when needed, perhaps an environment
> variable or even a whole new "container build" stage before the
> "container" stage.
>
> What do you think?

We definitely want the rebuild to be cached.
So whatever is broken in that regard needs fixing, as this used to
work AFAIK.

Ideally we would skip the container stage entirely for any pull
request that did NOT include changes to the dockerfile.

The problem is that the way we're using gitlab doesn't let that work
well. We would need to set up rules based on filepath. Such rules
are totally unreliable for push events in practice, because they
only evaluate the delta between what you just pushed and what was
already available on the server. This does not match the content of
the pull request; it might be just a subset.

If we had subsystem maintainers opening a merge request for their
submission, then we could reliably write rules based on what files
are changed by the pull request, and entirely skip the containers
stage most of the time, which would be an even bigger saving.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
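[For reference, the path-based skipping Daniel describes could be sketched as a rules: fragment like the one below. This is a hypothetical job fragment, not the tree's actual CI config; the template name and path globs are assumptions. GitLab evaluates rules:changes against the merge request's full diff only when the pipeline source is a merge request event, which is exactly the limitation being discussed:]

```yaml
amd64-debian-container:
  extends: .container_job_template
  rules:
    # Rebuild only when the dockerfiles or the build tooling changed
    # in the merge request; otherwise skip the job entirely.
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes:
        - tests/docker/dockerfiles/**
        - tests/docker/docker.py
        - .gitlab-ci.d/container*.yml
```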
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Feb 23, 2023 at 11:21:53AM -0300, Fabiano Rosas wrote:
>> I'm not sure if this was discussed previously, but I noticed we're not
>> pulling the images we push to the registry at every pipeline run.
>>
>> I would expect we don't actually need to rebuild container images at
>> _every_ pipeline run, so I propose we add a "docker pull" to the
>> container templates. We already have that for the docker-edk2|opensbi
>> images.
>>
>> Some containers can take a long time to build (14 mins) and pulling
>> the image first without building can cut the time to about 3
>> mins. With this we can save almost 2h of cumulative CI time per
>> pipeline run:
>
> The docker.py script that we're invoking is already pulling the
> image itself eg to pick a random recent job:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/3806090058
>
> We can see
>
> $ ./tests/docker/docker.py --engine docker build -t "qemu/$NAME" \
>     -f "tests/docker/dockerfiles/$NAME.docker" \
>     -r $CI_REGISTRY/qemu-project/qemu
> Using default tag: latest
> latest: Pulling from qemu-project/qemu/qemu/debian-arm64-cross
> bb263680fed1: Pulling fs layer
> ...snip...

Ah right, so this is different for users' pipelines because the push
at the end of the build goes to the user's registry:

  registry.gitlab.com/farosas/qemu/qemu/debian-arm64-cross

So we're fetching from one place and pushing to a different one. That
is why I see the improvement.

> none the less it still went ahead and rebuilt the image from scratch

It seems docker.py does not see that we're trying to build a tag that
is already there. Could this be due to --cache-from being "disabled"?

    if ("docker" in self._command and
        "TRAVIS" not in os.environ and
        "GITLAB_CI" not in os.environ):
        os.environ["DOCKER_BUILDKIT"] = "1"
        self._buildkit = True
    else:
        self._buildkit = False

> so something is going wrong here.
> I don't know why your change adding
> an extra 'docker pull' would have any effect, given we're already
> pulling, so I wonder if that's just a coincidental apparent change
> due to the initial state of your fork's container registry.
>
> Whenever I look at this I end up wishing our docker.py didn't exist
> and that we could just directly do
>
> - docker pull "$TAG"
> - docker build --cache-from "$TAG" --tag "$TAG" -f "tests/docker/$NAME.docker"

I see that in the docker-opensbi image we do just that.

> as that should be sufficient to build the image with caching.
>
>> We would need to devise a mechanism (not included here) to force the
>> re-build of the container images when needed, perhaps an environment
>> variable or even a whole new "container build" stage before the
>> "container" stage.
>>
>> What do you think?
>
> We definitely want the rebuild to be cached. So whatever is
> broken in that regard needs fixing, as this used to work AFAIK.
>
> Ideally we would skip the container stage entirely for any
> pull request that did NOT include changes to the dockerfile.

Agreed.
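[The --cache-from/BUILDKIT check quoted earlier in the thread can be restated as a standalone predicate, just to make the branching explicit. The function name is hypothetical, not part of docker.py:]

```python
import os

def buildkit_enabled(command, environ=None):
    """Mirror of the quoted docker.py logic: BuildKit is turned on
    only for the docker engine, and never under Travis or GitLab CI
    (command is the engine command list, e.g. ["docker", "build"])."""
    if environ is None:
        environ = os.environ
    return ("docker" in command
            and "TRAVIS" not in environ
            and "GITLAB_CI" not in environ)
```

So inside a GitLab CI job the build runs with the legacy builder, which only honours --cache-from for images that are already present locally; this is consistent with the pull-first behaviour mattering there.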
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Feb 23, 2023 at 11:21:53AM -0300, Fabiano Rosas wrote:
>> I'm not sure if this was discussed previously, but I noticed we're not
>> pulling the images we push to the registry at every pipeline run.
>>
>> I would expect we don't actually need to rebuild container images at
>> _every_ pipeline run, so I propose we add a "docker pull" to the
>> container templates. We already have that for the docker-edk2|opensbi
>> images.
>>
>> Some containers can take a long time to build (14 mins) and pulling
>> the image first without building can cut the time to about 3
>> mins. With this we can save almost 2h of cumulative CI time per
>> pipeline run:
>
> The docker.py script that we're invoking is already pulling the
> image itself eg to pick a random recent job:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/3806090058
>
> We can see
>
> $ ./tests/docker/docker.py --engine docker build -t "qemu/$NAME" \
>     -f "tests/docker/dockerfiles/$NAME.docker" \
>     -r $CI_REGISTRY/qemu-project/qemu
> Using default tag: latest
> latest: Pulling from qemu-project/qemu/qemu/debian-arm64-cross
> bb263680fed1: Pulling fs layer
> ...snip...
>
> nonetheless it still went ahead and rebuilt the image from scratch
> so something is going wrong here. I don't know why your change adding
> an extra 'docker pull' would have any effect, given we're already
> pulling, so I wonder if that's just a coincidental apparent change
> due to the initial state of your fork's container registry.
>
> Whenever I look at this I end up wishing our docker.py didn't exist
> and that we could just directly do
>
> - docker pull "$TAG"
> - docker build --cache-from "$TAG" --tag "$TAG" -f "tests/docker/$NAME.docker"
>
> as that should be sufficient to build the image with caching.

I think we should be ready to do that now as we have flattened all
our dockerfiles. The only other thing that docker.py does is nicely
add a final step for the current user so you can ensure all files
generated in docker cross-compile images are still readable on the
host.

>> We would need to devise a mechanism (not included here) to force the
>> re-build of the container images when needed, perhaps an environment
>> variable or even a whole new "container build" stage before the
>> "container" stage.
>>
>> What do you think?
>
> We definitely want the rebuild to be cached. So whatever is
> broken in that regard needs fixing, as this used to work AFAIK.
>
> Ideally we would skip the container stage entirely for any
> pull request that did NOT include changes to the dockerfile.

That would be ideal.

> The problem is that the way we're using gitlab doesn't let
> that work well. We would need to set up rules based on filepath.
> Such rules are totally unreliable for push events in
> practice, because they only evaluate the delta between what
> you just pushed and what was already available on the server.
> This does not match the content of the pull request; it might
> be just a subset.
>
> If we had subsystem maintainers opening a merge request for
> their submission, then we could reliably write rules based
> on what files are changed by the pull request, and entirely
> skip the containers stage most of the time, which would be
> an even bigger saving.

Our first tentative steps away from an email process?

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Hi Alex,

> Daniel P. Berrangé <berrange@redhat.com> writes:
>
>> On Thu, Feb 23, 2023 at 11:21:53AM -0300, Fabiano Rosas wrote:
>>> I'm not sure if this was discussed previously, but I noticed we're not
>>> pulling the images we push to the registry at every pipeline run.
>>>
>>> I would expect we don't actually need to rebuild container images at
>>> _every_ pipeline run, so I propose we add a "docker pull" to the
>>> container templates. We already have that for the docker-edk2|opensbi
>>> images.
>>>
>>> Some containers can take a long time to build (14 mins) and pulling
>>> the image first without building can cut the time to about 3
>>> mins. With this we can save almost 2h of cumulative CI time per
>>> pipeline run:
>>
>> The docker.py script that we're invoking is already pulling the
>> image itself eg to pick a random recent job:
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/3806090058
>>
>> We can see
>>
>> $ ./tests/docker/docker.py --engine docker build -t "qemu/$NAME" \
>>     -f "tests/docker/dockerfiles/$NAME.docker" \
>>     -r $CI_REGISTRY/qemu-project/qemu
>> Using default tag: latest
>> latest: Pulling from qemu-project/qemu/qemu/debian-arm64-cross
>> bb263680fed1: Pulling fs layer
>> ...snip...
>>
>> nonetheless it still went ahead and rebuilt the image from scratch
>> so something is going wrong here. I don't know why your change adding
>> an extra 'docker pull' would have any effect, given we're already
>> pulling, so I wonder if that's just a coincidental apparent change
>> due to the initial state of your fork's container registry.
>>
>> Whenever I look at this I end up wishing our docker.py didn't exist
>> and that we could just directly do
>>
>> - docker pull "$TAG"
>> - docker build --cache-from "$TAG" --tag "$TAG" -f "tests/docker/$NAME.docker"
>>
>> as that should be sufficient to build the image with caching.
>
> I think we should be ready to do that now as we have flattened all our
> dockerfiles. The only other thing that docker.py does is nicely add a
> final step for the current user so you can ensure all files generated
> in docker cross-compile images are still readable on the host.

Just so you know, this command line worked:

  docker build --cache-from $TAG --tag $TAG --build-arg BUILDKIT_INLINE_CACHE=1 \
      -f "tests/docker/dockerfiles/$NAME.docker" "."

building the cache: https://gitlab.com/farosas/qemu/-/jobs/3825838177
using the cache: https://gitlab.com/farosas/qemu/-/jobs/3825926944

But we might still have this issue:

  commit 6ddc3dc7a882f2e7200fa7fecf505a8d0d8bbea9
  Author: Daniel P. Berrangé <berrange@redhat.com>
  Date:   Fri Jul 9 15:29:35 2021 +0100

      tests/docker: don't use BUILDKIT in GitLab either

      Using BUILDKIT breaks with certain container registries such as
      CentOS, with docker build reporting an error such as

        failed to solve with frontend dockerfile.v0:
        failed to build LLB: failed to load cache key:
        unexpected status code
        https://registry.centos.org/v2/centos/manifests/7:
        403 Forbidden

We might need to go the route of skipping the docker build when the
docker pull succeeds.
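[That pull-first-then-maybe-build strategy could be sketched as below. This is a hypothetical helper, not existing code: the command runner is injected so the decision logic itself is testable without docker, and the docker invocations are the ones discussed in this thread:]

```python
def pull_or_build(tag, dockerfile, run):
    """Reuse the registry image when possible; rebuild only when the
    pull fails. `run` executes a command list and returns its exit
    code (e.g. lambda cmd: subprocess.run(cmd).returncode)."""
    if run(["docker", "pull", tag]) == 0:
        # The image already exists in the registry: skip the build.
        return "pulled"
    # No usable image: do a full build, still passing --cache-from in
    # case a partial layer cache is available locally.
    rc = run(["docker", "build", "--cache-from", tag, "--tag", tag,
              "-f", dockerfile, "."])
    if rc != 0:
        raise RuntimeError("docker build failed")
    return "built"
```

A forced rebuild (the mechanism discussed in the cover letter) would then just mean skipping the pull branch, e.g. when an environment variable is set.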
On Thu, Feb 23, 2023 at 05:30:58PM -0300, Fabiano Rosas wrote:
> Hi Alex,
>
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>
>>> On Thu, Feb 23, 2023 at 11:21:53AM -0300, Fabiano Rosas wrote:
>>>> I'm not sure if this was discussed previously, but I noticed we're not
>>>> pulling the images we push to the registry at every pipeline run.
>>>>
>>>> I would expect we don't actually need to rebuild container images at
>>>> _every_ pipeline run, so I propose we add a "docker pull" to the
>>>> container templates. We already have that for the docker-edk2|opensbi
>>>> images.
>>>>
>>>> Some containers can take a long time to build (14 mins) and pulling
>>>> the image first without building can cut the time to about 3
>>>> mins. With this we can save almost 2h of cumulative CI time per
>>>> pipeline run:
>>>
>>> The docker.py script that we're invoking is already pulling the
>>> image itself eg to pick a random recent job:
>>>
>>> https://gitlab.com/qemu-project/qemu/-/jobs/3806090058
>>>
>>> We can see
>>>
>>> $ ./tests/docker/docker.py --engine docker build -t "qemu/$NAME" \
>>>     -f "tests/docker/dockerfiles/$NAME.docker" \
>>>     -r $CI_REGISTRY/qemu-project/qemu
>>> Using default tag: latest
>>> latest: Pulling from qemu-project/qemu/qemu/debian-arm64-cross
>>> bb263680fed1: Pulling fs layer
>>> ...snip...
>>>
>>> nonetheless it still went ahead and rebuilt the image from scratch
>>> so something is going wrong here. I don't know why your change adding
>>> an extra 'docker pull' would have any effect, given we're already
>>> pulling, so I wonder if that's just a coincidental apparent change
>>> due to the initial state of your fork's container registry.
>>>
>>> Whenever I look at this I end up wishing our docker.py didn't exist
>>> and that we could just directly do
>>>
>>> - docker pull "$TAG"
>>> - docker build --cache-from "$TAG" --tag "$TAG" -f "tests/docker/$NAME.docker"
>>>
>>> as that should be sufficient to build the image with caching.
>>
>> I think we should be ready to do that now as we have flattened all our
>> dockerfiles. The only other thing that docker.py does is nicely add a
>> final step for the current user so you can ensure all files generated
>> in docker cross-compile images are still readable on the host.
>
> Just so you know, this command line worked:
>
>   docker build --cache-from $TAG --tag $TAG --build-arg BUILDKIT_INLINE_CACHE=1 \
>       -f "tests/docker/dockerfiles/$NAME.docker" "."
>
> building the cache: https://gitlab.com/farosas/qemu/-/jobs/3825838177
> using the cache: https://gitlab.com/farosas/qemu/-/jobs/3825926944
>
> But we might still have this issue:
>
>   commit 6ddc3dc7a882f2e7200fa7fecf505a8d0d8bbea9
>   Author: Daniel P. Berrangé <berrange@redhat.com>
>   Date:   Fri Jul 9 15:29:35 2021 +0100
>
>       tests/docker: don't use BUILDKIT in GitLab either
>
>       Using BUILDKIT breaks with certain container registries such as
>       CentOS, with docker build reporting an error such as
>
>         failed to solve with frontend dockerfile.v0:
>         failed to build LLB: failed to load cache key:
>         unexpected status code
>         https://registry.centos.org/v2/centos/manifests/7:
>         403 Forbidden
>
> We might need to go the route of skipping the docker build when the
> docker pull succeeds.

I don't recall what made QEMU special in that respect, but we never
set this variable in any libvirt project and didn't see any obvious
problems.
So I'm inclined to say it was something docker.py does that caused
the problem.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|