Enables caching from the qemu-project repository.
Uses a dedicated "$NAME-cache" repository for the cache, to work around
a kaniko limitation.
See issue "when using --cache=true, kaniko fail to push cache layer [...]":
https://github.com/GoogleContainerTools/kaniko/issues/1459
Does not specify a build context, since none of the Dockerfiles use
COPY or ADD instructions.
Does not enable reproducible builds, as doing so results in builds
failing with an out-of-memory error.
See issue "Using --reproducible loads entire image into memory":
https://github.com/GoogleContainerTools/kaniko/issues/862
Previous attempts, for the record:
- Alex Bennée: https://lore.kernel.org/qemu-devel/20230330101141.30199-12-alex.bennee@linaro.org/
- Camilla Conte (me): https://lore.kernel.org/qemu-devel/20230531150824.32349-6-cconte@redhat.com/
Signed-off-by: Camilla Conte <cconte@redhat.com>
---
.gitlab-ci.d/container-template.yml | 25 +++++++++++--------------
1 file changed, 11 insertions(+), 14 deletions(-)
diff --git a/.gitlab-ci.d/container-template.yml b/.gitlab-ci.d/container-template.yml
index 4eec72f383..066f253dd5 100644
--- a/.gitlab-ci.d/container-template.yml
+++ b/.gitlab-ci.d/container-template.yml
@@ -1,21 +1,18 @@
 .container_job_template:
   extends: .base_job_template
-  image: docker:latest
   stage: containers
-  services:
-    - docker:dind
+  image:
+    name: gcr.io/kaniko-project/executor:debug
+    entrypoint: [""]
+  variables:
+    DOCKERFILE: "$CI_PROJECT_DIR/tests/docker/dockerfiles/$NAME.docker"
+    CACHE_REPO: "$CI_REGISTRY/qemu-project/qemu/qemu/$NAME-cache"
   before_script:
     - export TAG="$CI_REGISTRY_IMAGE/qemu/$NAME:$QEMU_CI_CONTAINER_TAG"
-    # Always ':latest' because we always use upstream as a common cache source
-    - export COMMON_TAG="$CI_REGISTRY/qemu-project/qemu/qemu/$NAME:latest"
-    - docker login $CI_REGISTRY -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD"
-    - until docker info; do sleep 1; done
   script:
     - echo "TAG:$TAG"
-    - echo "COMMON_TAG:$COMMON_TAG"
-    - docker build --tag "$TAG" --cache-from "$TAG" --cache-from "$COMMON_TAG"
-      --build-arg BUILDKIT_INLINE_CACHE=1
-      -f "tests/docker/dockerfiles/$NAME.docker" "."
-    - docker push "$TAG"
-  after_script:
-    - docker logout
+    - /kaniko/executor
+      --dockerfile "$DOCKERFILE"
+      --destination "$TAG"
+      --cache=true
+      --cache-repo="$CACHE_REPO"
--
2.45.0
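
For context, the per-distro container jobs consume this template by only
setting NAME; a minimal sketch of such a job, along the lines of the
existing definitions in .gitlab-ci.d/containers.yml (the job name and
distro value here are illustrative):

    amd64-alpine-container:
      extends: .container_job_template
      variables:
        NAME: alpine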
On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
> Enables caching from the qemu-project repository.
>
> Uses a dedicated "$NAME-cache" repository for the cache, to work around
> a kaniko limitation.
> See issue "when using --cache=true, kaniko fail to push cache layer [...]":
> https://github.com/GoogleContainerTools/kaniko/issues/1459

After investigating, this is a result of a different design approach
for caching in kaniko.

In docker, any existing image can be leveraged as a cache source,
reusing the individual layers that are present. IOW, there's no
difference between a cache and a final image; they're one and the
same thing.

In kaniko, the cache is a distinct object type. IIUC, it is not
populated with the individual layers; instead it has a custom format
for storing the cached content. Therefore the concept of storing the
cache at the same location as the final image is completely
inappropriate - you can't store two completely different kinds of
content in the same place.

That is also why you can't just "docker pull" the cache image(s)
beforehand, and also why it doesn't look like you can use multiple
cache sources with kaniko.

None of this is inherently a bad thing..... except when it comes to
data storage. By using Kaniko we would, at a minimum, be doubling the
amount of data storage we consume in the gitlab registry.

This is a potentially significant concern because GitLab does
technically have a limited storage quota, even with our free OSS plan
subscription. Due to technical limitations, they've never been able to
actually enforce it thus far, but one day they probably will. At which
point we're doomed, because even with our current Docker-in-Docker
setup I believe we're exceeding our quota. Thus the idea of doubling
our container storage usage is pretty unappealing.

We can avoid that by running without cache, but that has the cost of
increasing the job running time, since all containers would be rebuilt
on every pipeline. This will burn through our Azure compute allowance
more quickly (or our GitLab CI credits if we had to switch away from
Azure).

> Does not specify a build context, since none of the Dockerfiles use
> COPY or ADD instructions.
>
> Does not enable reproducible builds, as doing so results in builds
> failing with an out-of-memory error.
> See issue "Using --reproducible loads entire image into memory":
> https://github.com/GoogleContainerTools/kaniko/issues/862
...
> +    - /kaniko/executor
> +      --dockerfile "$DOCKERFILE"
> +      --destination "$TAG"
> +      --cache=true
> +      --cache-repo="$CACHE_REPO"

I'm surprised there is no need to provide the user/password login
credentials for the registry. Nonetheless I tested this and it
succeeded. I guess gitlab somehow has some magic authorization granted
to any CI job that avoids the need for a manual login? I wonder why we
needed the 'docker login' step though? Perhaps because D-in-D results
in using an externally running docker daemon which didn't inherit
credentials from the job environment?

Caching of course fails when I'm running jobs in my fork. IOW, if we
change container content in a fork and want to test it, it will be
doing a full build from scratch every time. This likely isn't the end
of the world, because dockerfiles change infrequently, and when they
do, paying the price of a full rebuild is a time-limited problem
unless a pull request is sent and accepted.

TL;DR: functionally this patch is capable of working. The key downside
is that it doubles our storage usage. I'm not convinced Kaniko offers
a compelling enough benefit to justify this penalty.

With regards,
Daniel
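
For reference on the credentials question: GitLab's kaniko documentation
has jobs write the registry auth into kaniko's docker config file from
the predefined CI variables before running the executor. If explicit
authentication does turn out to be necessary, a minimal sketch of that
documented pattern would be:

    before_script:
      # kaniko reads registry credentials from /kaniko/.docker/config.json
      - mkdir -p /kaniko/.docker
      - echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":\"$(printf '%s:%s' "${CI_REGISTRY_USER}" "${CI_REGISTRY_PASSWORD}" | base64 | tr -d '\n')\"}}}" > /kaniko/.docker/config.json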
On Thu, May 16, 2024 at 07:24:04PM +0100, Daniel P. Berrangé wrote:
> On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
> > Enables caching from the qemu-project repository.
...
> None of this is inherently a bad thing..... except when it comes to
> data storage. By using Kaniko we would, at a minimum, be doubling the
> amount of data storage we consume in the gitlab registry.

Double is actually just the initial case. The cache stores layers using
docker tags whose names appear to be based on a hash of the "RUN"
command.

IOW, the first time we build a container we have double the usage. When
a dockerfile is updated changing a 'RUN' command, we now have triple the
storage usage for cache. Update the RUN command again, and we now have
quadruple the storage, etc.

Kaniko does not appear to purge cache entries itself, and relies on
something else to do the cache purging.

GitLab has support for purging old docker tags, but I'm not an admin on
the QEMU project namespace, so I can't tell whether it can be enabled or
not. Many older projects have this permanently disabled due to
historical compat issues in gitlab after they introduced the feature.

With regards,
Daniel
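
For reference, the tag purging mentioned above maps to GitLab's container
registry cleanup policy, which a project admin could also configure
through the "edit project" API; a hedged sketch (the token and retention
values are placeholders, and a tag-matching regex may additionally need
to be set, per the GitLab docs):

    # Enable a cleanup policy: evaluate daily, keep the 5 newest tags,
    # delete the rest once they are older than 14 days.
    curl --request PUT \
         --header "PRIVATE-TOKEN: <token-of-a-project-admin>" \
         --data "container_expiration_policy_attributes[enabled]=true" \
         --data "container_expiration_policy_attributes[cadence]=1d" \
         --data "container_expiration_policy_attributes[keep_n]=5" \
         --data "container_expiration_policy_attributes[older_than]=14d" \
         "https://gitlab.com/api/v4/projects/qemu-project%2Fqemu"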
On Fri, May 17, 2024 at 9:14 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, May 16, 2024 at 07:24:04PM +0100, Daniel P. Berrangé wrote:
...
> Kaniko does not appear to purge cache entries itself, and relies on
> something else to do the cache purging.
>
> GitLab has support for purging old docker tags, but I'm not an admin on
> the QEMU project namespace, so I can't tell whether it can be enabled or
> not. Many older projects have this permanently disabled due to
> historical compat issues in gitlab after they introduced the feature.

I'm pretty sure purging can be enabled. Gitlab itself proposes this with
a "set up cleanup" link on the registry page (1).
Can you recall what issues they were experiencing?

If this is the only issue blocking Kaniko adoption, and we can't solve
it by enabling the cleanup, I can write an additional step at the end of
the container build to explicitly remove old cache tags.

(1) https://gitlab.com/qemu-project/qemu/container_registry
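
The explicit cleanup step suggested above could plausibly use the
registry API's bulk tag delete endpoint; a rough sketch (the repository
ID is a placeholder, and a token with sufficient rights is assumed, as
the default CI job token is unlikely to be enough for this call):

    # Bulk-delete cache tags in one registry repository:
    # keep the 5 newest, drop the rest once older than 7 days.
    curl --request DELETE \
         --header "PRIVATE-TOKEN: <token-with-api-scope>" \
         --data "name_regex_delete=.*" \
         --data "keep_n=5" \
         --data "older_than=7d" \
         "https://gitlab.com/api/v4/projects/qemu-project%2Fqemu/registry/repositories/<repository-id>/tags"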
On Mon, May 20, 2024 at 05:56:46PM +0100, Camilla Conte wrote:
> On Fri, May 17, 2024 at 9:14 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
...
> I'm pretty sure purging can be enabled. Gitlab itself proposes this with
> a "set up cleanup" link on the registry page (1).
> Can you recall what issues they were experiencing?

Looks like they may have finally fixed the issue in gitlab. They had
previously blocked cleanup on all repositories older than a certain date.

> If this is the only issue blocking Kaniko adoption, and we can't solve
> it by enabling the cleanup, I can write an additional step at the end of
> the container build to explicitly remove old cache tags.

Cleanup stops the container usage growing without bound, but switching
to Kaniko will still double our long-term storage usage, which is pretty
undesirable IMHO.

With regards,
Daniel
On 16/05/2024 20.24, Daniel P. Berrangé wrote:
> On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
>> Enables caching from the qemu-project repository.
>>
>> Uses a dedicated "$NAME-cache" repository for the cache, to work around
>> a kaniko limitation.
>> See issue "when using --cache=true, kaniko fail to push cache layer [...]":
>> https://github.com/GoogleContainerTools/kaniko/issues/1459
...
> TL;DR: functionally this patch is capable of working. The key downside
> is that it doubles our storage usage. I'm not convinced Kaniko offers
> a compelling enough benefit to justify this penalty.

Will this patch fix the issues that we are currently seeing with the k8s
runners not working in the upstream CI? If so, I think that would be
enough benefit, wouldn't it?

 Thomas
On Fri, May 17, 2024 at 08:24:44AM +0200, Thomas Huth wrote:
> On 16/05/2024 20.24, Daniel P. Berrangé wrote:
> > On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
> > > Enables caching from the qemu-project repository.
...
> > TL;DR: functionally this patch is capable of working. The key downside
> > is that it doubles our storage usage. I'm not convinced Kaniko offers
> > a compelling enough benefit to justify this penalty.
>
> Will this patch fix the issues that we are currently seeing with the k8s
> runners not working in the upstream CI? If so, I think that would be
> enough benefit, wouldn't it?

Paolo said on IRC that he has reverted the changes to the runner which
caused us problems. Docker in Docker is still a documented & supported
option for GitLab AFAICT, so I would hope we can keep using it as before.

With regards,
Daniel