Docker Multistage Build
First of all, I assume that you already know and use Docker. Using Docker in a production environment may still be controversial these days, but it is very convenient for the development and deployment stages. In my opinion, depending on the project, Docker is generally quite successful in production as well. Never mind, I don’t want to prove or defend Docker; she doesn’t need it anyway :)
We will focus on build operations right now. In this context, we will talk about Multistage Build, which aims to reduce the size of the image. Multistage Build also helps to produce clean images that are easier to manage, because the build dependencies are left out of the final image, and the build process can become faster as well.
Layer Caching
Strictly speaking, caching is not the subject here, but I cannot pass without mentioning it. As you know, Docker images consist of layers. Docker Layer Caching (DLC) can reduce Docker image build times. It does not affect the image size, but build time is another important aspect.
Some instructions in a Dockerfile create a layer, and caching those layers avoids running unnecessary compilation steps over and over again. Docker Layer Caching mainly works on RUN, COPY, and ADD commands. But you should be careful while writing a Dockerfile: the order of the instructions matters for caching. When the files change or the order of instructions changes in your Dockerfile, the cache is invalidated for the step in which the change was made and for every step after it. Therefore, order the instructions from the least frequently changing step to the most frequently changing one.
The layer cache is enabled by default. When the ‘--no-cache’ option is passed to ‘docker build …’, the build always starts from scratch and writes a new image to the file system even if nothing in the Dockerfile has changed. This guarantees that stale results are not reused, but it always takes the maximum amount of time.
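As a rough sketch of that ordering rule, assume a Go project that uses Go modules. The go.mod and go.sum files and the /src path are assumptions made for illustration, not part of the example later in this post. The dependency manifests are copied and resolved before the source code, so editing a .go file only invalidates the layers from the second COPY onwards:

```dockerfile
FROM golang:latest
WORKDIR /src

# Changes rarely: dependency manifests (assumed to exist in the project).
COPY go.mod go.sum ./
# This layer stays cached until go.mod or go.sum changes.
RUN go mod download

# Changes often: application source. Only the layers from here on
# are rebuilt when a source file is edited.
COPY . .
RUN go build -o example.app .
```

If the two COPY instructions were merged into a single “COPY . .” at the top, every source change would also force the dependencies to be downloaded again.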
What is Multistage?
Anyway, to return to our subject: Multistage Build was introduced to make Dockerfiles easier to read and maintain, so it is an important tool for productivity. The logic is based on using more than one base image by adding more than one “FROM” instruction to the Dockerfile. Each “FROM” instruction starts a new build stage, and only the last stage ends up in the image that is produced. Thus you can leave behind everything from the earlier stages and copy only the outputs you need from them. In the end, your image is free of unnecessary layers and of the images those layers came from. In this way, you can reduce your image size.
Example Implementation
Basically what we are doing is using an intermediate container to compile our application. The “Dockerfile” I created is as follows:
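# Written for a GOPATH-mode Go toolchain. Recent golang images default to
# module mode, where "go get" outside a module fails.
FROM golang:latest
ENV BUILD_DIR=/go/src/example
RUN go get github.com/gin-gonic/gin
WORKDIR "${BUILD_DIR}"
COPY . "${BUILD_DIR}"
RUN GOOS=linux go build -o example.app example.go
CMD ["./example.app"]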
This is a simple go-gin example. We have a file named “example.go”.
I assume you are in the directory that contains the two files, “Dockerfile” and “example.go”. After running the commands below in that directory, you will see a Docker image with the tag “latest”, approximately 1 GB in size.
docker build -f Dockerfile -t example .
docker images
This image is far too big to run a simple web application. Actually, we only need a small Docker image to run our binary, so we can use an “Alpine Linux” Docker image.
Now, please create a file named “Dockerfile-ms” in the same directory, as follows:
# Stage 0: compile the application
FROM golang:latest
ENV BUILD_DIR=/go/src/example
RUN go get github.com/gin-gonic/gin
WORKDIR "${BUILD_DIR}"
COPY . "${BUILD_DIR}"
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o example.app example.go

# Stage 1: run the compiled binary on a small base image
FROM alpine:latest
WORKDIR /
COPY --from=0 /go/src/example/example.app /
RUN chmod 755 /example.app
CMD ["/example.app"]
After running the command below in the directory, you will see a Docker image with the tag “ms”, approximately 35 MB in size.
docker build -f Dockerfile-ms -t example:ms .
I added two FROM lines. The first is for compiling, the second is for running. The base of the final image is the last “FROM”, so I chose a very small “Alpine Linux” image. This image only runs the application. Compiling and running are separate stages. Note that the build in the first stage sets CGO_ENABLED=0, which produces a statically linked binary; that is what lets a binary compiled on the Debian-based golang image run on Alpine. The main purpose here is to reduce the size of the final Docker image that you need to push to Docker Hub. When you try to push these images to AWS ECR, you will see how effective this is.
When working on bigger projects, you will probably need to add many images with “FROM” instructions. As you may have noticed, we used the “--from” option of the COPY instruction in “Dockerfile-ms”. This option copies files from one image to another. Yes, that’s right: you can copy files between images. “--from=0” means “copy from the first image we added to the Dockerfile with a FROM instruction”; the stages are numbered starting at 0. However, you can use an alias instead of the index number. To do that, give the image an alias with the “AS” keyword in the “FROM” instruction:
FROM golang:latest AS builder
Then you can use this alias in COPY instructions.
COPY --from=builder /go/src/example/example.app /
This method is more useful, because using an index may confuse you.
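The “--from” option is not limited to stages defined in the same Dockerfile. It also accepts the name of an image, which Docker pulls if it is not available locally. A small sketch, where the nginx image and the file paths are only illustrative choices and not part of this post’s example:

```dockerfile
FROM alpine:latest
# Copy a single file straight out of another image; no build stage
# for that image is declared in this Dockerfile.
COPY --from=nginx:latest /etc/nginx/nginx.conf /nginx.conf
```

This can be handy when you only need one file or binary from an existing image and do not want to inherit the rest of it.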
The final “Dockerfile-ms” file is as follows:
# Build stage: compile the application
FROM golang:latest AS builder
ENV BUILD_DIR=/go/src/example
RUN go get github.com/gin-gonic/gin
WORKDIR "${BUILD_DIR}"
COPY . "${BUILD_DIR}"
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o example.app example.go

# Final stage: run the compiled binary on a small base image
FROM alpine:latest
WORKDIR /
COPY --from=builder /go/src/example/example.app /
RUN chmod 755 /example.app
CMD ["/example.app"]