GitHub Action `gha` Docker cache much slower than recreating the image


I set up a GitHub Actions workflow to build my Docker image and run tests on it. While building the image, a rather heavy file (a 6 GB model from Hugging Face) is downloaded. (As a side note, I could probably bring this down to 2 GB, because the weights are downloaded three times for a silly reason.) I thought I could speed things up by using the gha cache. The cache works, but it's much, much slower.

Here's the gist of my setup:

Non-caching GitHub Actions step:

      - name: Build Image
        shell: bash
        run: |
          docker buildx build -t ${IMAGE_TAG} -f ./Dockerfile .

Takes 3m 23s

Caching GitHub Actions step:

      - name: Build Image
        uses: docker/build-push-action@v5
        with:
          push: false  # do not push to remote registry
          load: true  # keep in local Docker registry
          context: .
          file: ./Dockerfile
          tags: ${{env.IMAGE_TAG}}
          cache-from: type=gha
          cache-to: type=gha,mode=max

Takes 7m 58s (the initial build, on the first commit with the new setup, took 12m 53s).

Downloading the 6 GB from Hugging Face takes about 30s. Downloading a 4 GB image from GitHub itself takes 279s.

Is this a GitHub problem? Is there any way to get around it?

This may be related to this question.

EDIT: apparently I'm not the only one suffering from this; see this issue on Docker's GitHub.


There are 2 answers

VonC

From the issue you mention, it seems you would need to wait for a new docker buildx release that fixes this.

Check whether you are already on buildx v0.12.0, which might help.
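
If the runner ships an older buildx, you can pin the release explicitly with docker/setup-buildx-action; its version input selects which buildx build to install. A minimal sketch:

- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3
  with:
    version: v0.12.0  # pin the buildx release mentioned in the linked issue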

In the meantime (pending a new buildx release), you should avoid the --load option, since it is causing significant slowdowns, and consider using the GitHub Container Registry for certain operations if it offers better performance.
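
For instance, one way to sidestep --load entirely is to build with the plain docker driver, where the result lands directly in the local Docker engine. A sketch, with the caveat that (as far as I know) the docker driver only supports inline cache export, so you trade the type=gha cache for skipping the load step:

- uses: docker/setup-buildx-action@v3
  with:
    driver: docker  # build results go straight into the local Docker engine

- uses: docker/build-push-action@v5
  with:
    context: .
    file: ./Dockerfile
    tags: ${{ env.IMAGE_TAG }}  # image is immediately usable by later steps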


Can you suggest any other way of achieving this without using the default docker-container driver with --load?

You might consider using a local registry for caching: that approach involves pushing the built image to a local registry in one step and pulling it from there in subsequent steps of the same job (a localhost registry lives on the runner, so it does not survive across separate jobs). That can be faster than using GitHub's cache system and avoids the performance issues with --load.

+------------------------+     +---------------------------+     +------------------------+
|  Start Local Registry  |     | Build and Push to Local   |     | Pull from Local        |
|                        |     | Registry                  |     | Registry in Subsequent |
|  docker run registry   |     |  docker build & push      |     | Step                   |
|         │              |     |         │                 |     |         │              |
|         ▼              |     |         ▼                 |     |         ▼              |
|  Local Docker Registry |     |  localhost:5000/my-image  |     | docker pull            |
+------------------------+     +---------------------------+     +------------------------+

In your job's first step, start a local Docker registry container:

- name: Start Local Registry
  run: docker run -d -p 5000:5000 --name registry registry:2

Then you would build your Docker image and push it to the local registry:

- name: Build and Push to Local Registry
  run: |
    docker build -t localhost:5000/my-image .
    docker push localhost:5000/my-image

In subsequent steps, pull the image from the local registry:

- name: Pull from Local Registry
  run: docker pull localhost:5000/my-image
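
As a variant, you can let GitHub Actions manage the registry container for the lifetime of the job via a services: block. A sketch at the job level (job and image names are illustrative):

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    services:
      registry:
        image: registry:2
        ports:
          - 5000:5000  # registry reachable at localhost:5000 from every step
    steps:
      - uses: actions/checkout@v4
      - name: Build and Push to Local Registry
        run: |
          docker build -t localhost:5000/my-image .
          docker push localhost:5000/my-image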

Another approach:

Directly saving and loading images: instead of using the --load option, you can save the Docker image as a tarball and cache the tarball using GitHub's caching. In the subsequent job, you can restore the tarball from the cache and load it into Docker. That method might be more efficient than going through the docker-container driver.

+-----------------------+     +-----------------------+     +--------------------------+
|  Build and Save Image |     | Cache the Tarball     |     | Load Image in Subsequent |
|                       |     |                       |     | Job                      |
|  docker build & save  |     |  actions/cache        |     |  docker load             |
|         │             |     |         │             |     |         │                |
|         ▼             |     |         ▼             |     |         ▼                |
|  my-image.tar         |     |  Cache my-image.tar   |     |  Restore & Load Image    |
+-----------------------+     +-----------------------+     +--------------------------+

Build your Docker image and save it as a tarball.

- name: Build and Save Image
  run: |
    docker build -t my-image .
    docker save my-image > my-image.tar

Use GitHub Actions' cache to store the tarball.

- uses: actions/cache@v2
  with:
    path: my-image.tar
    key: ${{ runner.os }}-my-image-${{ hashFiles('**/Dockerfile') }}

Restore the tarball from the cache and load it into Docker.

- name: Load Image from Tarball
  run: |
    if [ -f my-image.tar ]; then
      docker load < my-image.tar
    fi
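
Order matters when wiring these steps together: the actions/cache step has to run before the build so the tarball can be restored, and its cache-hit output lets you skip the rebuild entirely on a warm cache. A sketch of the full sequence, using the same placeholder names as above:

- uses: actions/cache@v2
  id: image-cache
  with:
    path: my-image.tar
    key: ${{ runner.os }}-my-image-${{ hashFiles('**/Dockerfile') }}

- name: Build and Save Image
  if: steps.image-cache.outputs.cache-hit != 'true'  # only rebuild on a cache miss
  run: |
    docker build -t my-image .
    docker save my-image > my-image.tar

- name: Load Image from Tarball
  if: steps.image-cache.outputs.cache-hit == 'true'  # warm cache: just load
  run: docker load < my-image.tar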

It seems this also saves a TAR file and loads it, which is exactly what --load does. The use of the GitHub cache directly is interesting, but you build the image anyway.

True, both methods essentially involve saving and loading a TAR file, which is similar in concept to what the --load option does in Docker. However, the key difference lies in how and where the TAR file is managed:

  • Local registry approach involves pushing and pulling the image to/from a local Docker registry running within the GitHub Actions environment. It avoids some of the I/O overhead associated with --load by directly interacting with a local registry.

  • Direct TAR file caching uses GitHub's own caching mechanism to store and retrieve the TAR file. While it still involves building and saving the image as a TAR file, it potentially offers more control over the caching process compared to using --load.

In both cases, the goal is to bypass some of the performance bottlenecks associated with the docker-container driver and --load, but they still fundamentally rely on a similar mechanism of saving and loading Docker images.

Mohamd Imran

The slowdown you're encountering in GitHub Actions with Docker caching could be due to a few factors:

  • Cache size: if your cache exceeds GitHub Actions' limits, cache retrieval slows down. Ensure your cache size stays within the recommended range.

  • Low cache hit rate: if the cache is frequently invalidated or doesn't match the expected layers, Docker has to fetch more data, which leads to slower builds.

  • Image bloat: periodically clean up old or unnecessary layers from your Docker image to reduce the cache size and improve retrieval times.

  • Serial builds: experiment with parallelizing your build steps; GitHub Actions supports parallel jobs.

  • Caching strategy: try different strategies, such as a warm-cache approach where you refresh the cache on a schedule, or a registry-backed layer cache (see the sketch below).
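
One concrete registry-backed option is to move the BuildKit layer cache off the gha backend and onto a registry such as GHCR, using the type=registry cache exporter. A minimal sketch, assuming you have already logged in to ghcr.io (the buildcache tag name is just a placeholder):

- name: Build Image
  uses: docker/build-push-action@v5
  with:
    context: .
    tags: ${{ env.IMAGE_TAG }}
    cache-from: type=registry,ref=ghcr.io/${{ github.repository }}:buildcache
    cache-to: type=registry,ref=ghcr.io/${{ github.repository }}:buildcache,mode=max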

Another approach to caching:

If you want to use caching and share the Docker image across subsequent job steps without relying on the default docker-container driver with --load, you can consider using GitHub Container Registry (GHCR) to store and retrieve your Docker image.

First, build your Docker image, tag it with a version or a unique identifier, and push it to GitHub Container Registry.

E.g.:

- name: Build and Push Image
  run: |
    # build under the full ghcr.io name so the push below works
    docker build -t ghcr.io/${{ github.repository_owner }}/${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }} .
    echo ${{ secrets.GITHUB_TOKEN }} | docker login ghcr.io -u ${{ github.actor }} --password-stdin
    docker push ghcr.io/${{ github.repository_owner }}/${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }}
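
Note that for the GITHUB_TOKEN login above to be allowed to push, the workflow (or the job) needs write access to packages, declared in a standard permissions block:

permissions:
  contents: read
  packages: write  # lets GITHUB_TOKEN push images to ghcr.io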

After the build and push, use the image in subsequent steps by pulling it directly from GitHub Container Registry instead of relying on local caching.

E.g.:

- name: Use Cached Image
  run: |
    docker pull ghcr.io/${{ github.repository_owner }}/${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }}
    # Continue with subsequent steps using the pulled image

This way, you leverage GitHub Container Registry to store and retrieve your Docker image across jobs, ensuring consistency and reproducibility. Hope this answers your question and clarifies a good approach to solving the caching issue.