Caching containers to speed up your builds

- By Manisha Sahasrabudhe on April 02, 2015

Important update on this blog

IMPORTANT:

This blog is based on the old shippable.yml format. A built-in yml translator translates code from the old format to the new one. Read more about the translation from the old to the new format here.

For the latest information, refer to our documentation on caching, or open a support issue if you have questions.

 

We have had a lot of questions from customers recently about caching containers to speed up their builds. Some have questions about how it works and others question whether it works at all. This blog post will address both of these audiences.

How can you enable cache for your builds? 
The main purpose of enabling cache for your builds is to speed up your build times. Typically, you will install everything you need for your builds in the ‘before_install’ section of your yml. If caching is not enabled, this section is executed from scratch on every build, adding several minutes to your build process. Caching the container avoids this tax on each build. You can enable caching by adding just one line to your shippable.yml:

cache: true

How does caching work?

Each build job is routed to one of our build hosts as chosen by our algorithm. When we detect that caching is turned on, we look for a previously cached image for that project on that host. If nothing is found, we run the build and then, as a last step, cache the entire container on that host. If we do find a previously cached image, we pull that image and run the new build on that container. In that case we do not recache the container after the build; the originally cached image stays on the host unchanged.
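To make the flow above concrete, here is a minimal Python sketch of the per-host decision. It is an illustration only, not our actual implementation: BuildHost, run_build, and the host and project names are hypothetical, and the cached "image" is modeled simply as the list of before_install commands that were baked in when the container was first cached.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BuildHost:
    """Hypothetical stand-in for a build host and the images cached on it."""
    name: str
    # project name -> before_install commands captured in the cached image
    cached_images: Dict[str, List[str]] = field(default_factory=dict)


def run_build(host: BuildHost, project: str, before_install: List[str],
              cache_enabled: bool = True) -> str:
    """Run one build on `host` and report how the cache was used."""
    if not cache_enabled:
        return "uncached"
    if project in host.cached_images:
        # A cached image exists on this host: reuse it and do NOT recache.
        return "hit"
    # First cached build for this project on this host: snapshot the container.
    host.cached_images[project] = list(before_install)
    return "miss"


host = BuildHost("host-a")
first = run_build(host, "my-app", ["npm install"])
second = run_build(host, "my-app", ["npm install", "npm install -g grunt-cli"])
print(first, second)                 # miss hit
print(host.cached_images["my-app"])  # ['npm install']
```

Note that the second build is a cache hit, yet the image stored on the host still reflects only what was installed when it was first cached.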

This has a few consequences that aren’t immediately obvious:
- Since caching is implemented per host, it can take a few builds for the caching to be picked up for every build as more and more build hosts cache the image.
- If you add something to your before_install step or package.json after the container is already cached, it will not be cached on the container. This is because we only cache the container the first time cache: true is detected on a build host. If you add something new and want it to be part of the cache, simply run a build with [reset minion] in the commit message. This resets your cache, and as new builds come in, the cache is rebuilt and picks up the new packages.
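Both situations can be replayed with a small simulation. This is a hedged sketch rather than our real scheduler: the simulate function, host names, and package names are made up for illustration, and it assumes that a [reset minion] build drops the project's cached image on every host, which the description above does not spell out.

```python
from typing import Dict, List, Tuple

RESET_TAG = "[reset minion]"  # tag from the post; everything else is hypothetical

Build = Tuple[str, str, List[str]]  # (host, commit message, packages installed)


def simulate(builds: List[Build]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Replay builds for one project; return per-build results and final cache."""
    cache: Dict[str, List[str]] = {}  # host -> packages frozen in its cached image
    results: List[str] = []
    for host, message, packages in builds:
        if RESET_TAG in message:
            # Assumption: a reset clears the cached image on all hosts.
            cache.clear()
        if host in cache:
            results.append("hit")   # reuse the image, never recache it
        else:
            cache[host] = list(packages)
            results.append("miss")  # first build on this host caches the container
    return results, cache


builds = [
    ("host-a", "initial commit", ["express"]),
    ("host-b", "fix typo", ["express"]),
    ("host-a", "add mocha", ["express", "mocha"]),
    ("host-a", "refresh cache [reset minion]", ["express", "mocha"]),
    ("host-b", "more work", ["express", "mocha"]),
]
results, cache = simulate(builds)
print(results)  # ['miss', 'miss', 'hit', 'miss', 'miss']
print(cache["host-a"])  # ['express', 'mocha']
```

The second build misses only because it landed on a different host, and the third build hits a cached image that does not yet contain mocha. Only after the reset do the cached images include the new package.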

Why don’t we just cache the container each time?


We made this decision because of a Docker limitation that allows only 127 aufs layers. Each time a container is cached, it adds a layer. This means that if we cached on every build, the first 127 builds would succeed but the 128th build on a host would fail, because docker run fails for a container with 127+ layers. To get around this limitation, we decided to cache only the first time, since most people don’t change their before_install sections very often.
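The arithmetic behind that decision can be checked with a short sketch. It is a simplified model under stated assumptions: each cache operation adds exactly one layer, layers already present in the base image are ignored, and a build fails to start once the image it would run on has reached 127 layers. The function name first_failing_build is hypothetical.

```python
from typing import Optional

MAX_AUFS_LAYERS = 127  # limit cited in the post


def first_failing_build(total_builds: int, recache_every_build: bool) -> Optional[int]:
    """Return the number of the first build that cannot start, or None."""
    layers = 0  # layers added by caching (base image layers ignored)
    for build in range(1, total_builds + 1):
        if layers >= MAX_AUFS_LAYERS:
            return build  # the container cannot be started on this image
        if recache_every_build or layers == 0:
            layers += 1   # caching the container adds one layer
    return None


print(first_failing_build(200, recache_every_build=True))   # 128
print(first_failing_build(200, recache_every_build=False))  # None
```

With caching on every build, the model fails on build 128, matching the numbers above. Caching only once keeps the layer count constant, so builds never hit the limit.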

Why do we cache per host?

We’re working on a central caching implementation, which will be available in a few weeks. We wanted to unblock customers asking for caching, so we implemented a quick per-host solution to help speed up builds until we move to a more comprehensive one.

Hope this helps clarify some questions around caching. If you have additional questions or concerns, feel free to drop us a note.



Topics: Docker, containers, how-to, features