.NET and MultiStage Dockerfiles

A while back I talked about building optimized Docker images (Building Optimized Docker Images with ASP.NET Core). With compiled runtimes like Go, Java and .NET, you first compile your code to produce a binary that can be run. The components required to compile your code are not required to run it, and the SDKs can be quite big, not to mention the additional attack surface they add. The best practice has been to build Docker images in steps. This could be done on your local machine, taking the output and placing it in a container. Or, as we did with Visual Studio, create a docker-compose.ci-build.yml file to first compile the code.

The challenge with building on the host, including hosted build agents, is that we must first have a build agent with everything we need, including the specific versions. If your dev shop has any history of .NET apps, you'll likely have multiple versions to maintain, which means complex agents to deal with those complexities. We could refer to these as "pets".

One of the big benefits of Docker is treating our build environments as "cattle". The build agents only need to know how to run docker. They have no need for .NET, Node, bower, gulp or any other build tools. The build environment can be specified with our source. This completely empowers the dev team of each project to determine and provide a build environment specific to their needs. The maintainers of your build environment need not know anything about the current versions or runtimes. They simply maintain generic cattle farms that know how to run docker, aggregate the build output logs and report on the build status.

Let's take a look at a multi-stage dockerfile:
FROM microsoft/aspnetcore:2.0 AS base
WORKDIR /app
EXPOSE 80
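The snippet above shows only the first stage. A representative version of the full four-stage file might look like the following; the project name Web and the output paths are placeholders, so adjust them to your solution:

```dockerfile
# Stage 1: the optimized runtime base image
FROM microsoft/aspnetcore:2.0 AS base
WORKDIR /app
EXPOSE 80

# Stage 2: the build environment, based on the full SDK image
FROM microsoft/aspnetcore-build:2.0 AS builder
WORKDIR /src
COPY Web/Web.csproj Web/
RUN dotnet restore Web/Web.csproj
COPY . .
WORKDIR /src/Web
RUN dotnet build -c Release -o /app

# Stage 3: publish the compiled output to a single directory
FROM builder AS publish
RUN dotnet publish -c Release -o /app

# Stage 4: copy the published output into the runtime image
FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "Web.dll"]
```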

At first glance, it simply looks like several dockerfiles stitched together. Multi-stage dockerfiles can be layered or inherited. When you look closer, there are a couple of key things to realize.
Notice the 3rd stage:

FROM builder AS publish

builder isn't an image pulled from a registry. It's the stage we defined in stage 2, where we named the result of our -build (SDK) image "builder". Docker build creates a named intermediate image we can later reference.

We can also copy the output from one image to another. This is the real power: we compile our code with one base SDK image (microsoft/aspnetcore-build), while creating a production image based on an optimized runtime image (microsoft/aspnetcore). Notice the line
COPY --from=publish /app .

This takes the /app directory from the publish image, and copies it to the working directory of the production image.

Breakdown Of Stages

The first stage provides the base of our optimized runtime image. Notice it derives from microsoft/aspnetcore. This is where we'd specify additional production configurations, such as registry configurations, running msiexec to install additional components, and so on; any of the environment configurations you would hand off to your ops folks to prepare the VM.
The second stage is our build environment: microsoft/aspnetcore-build. This includes everything we need to compile our code. From here, we have compiled binaries we can publish or test. More on testing in a moment.

The 3rd stage derives from our builder. It takes the compiled output and "publishes" it, in .NET terms. Publishing simply means taking all the output required to deploy your app/service/component and placing it in a single directory. This would include your compiled binaries, graphics (images), JavaScript, etc.

The 4th stage takes the published output and places it in the optimized image we defined in the first stage.

Why Is Publish Separate From Build?

You'll likely want to run unit tests to verify that your compiled code, or the aggregate of compiled code merged from multiple developers, continues to function as expected. To run unit tests, you could place the following stage between builder and publish.
FROM builder AS test
WORKDIR /src/Web.test
RUN dotnet test

If your tests fail, the build stops.
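With that stage in place, a CI agent could gate the build by targeting just the test stage; this is a minimal sketch, and the image tag myapp:test is hypothetical:

```shell
# Build only through the test stage; a failing `dotnet test` fails the build
docker build --target test -t myapp:test .
```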

Why Is Base First?

You could argue this is simply the logical flow. We first define the base runtime image. Get the compiled output ready, and place it in the base image. However, it's more practical. While debugging your applications under Visual Studio Container Tools, VS will debug your code directly in the base image. When you hit F5, Visual Studio will compile the code on your dev machine. It will then volume mount the output to the built runtime image; the first stage. This way you can test any configurations you've made to your production image, such as registry configurations or otherwise.
When docker build --target base is executed, docker starts processing the dockerfile from the beginning, through the stage (target) defined. Since base is the first stage, we take the shortest path, making the F5 experience as fast as possible. If base came after compilation (builder), you'd have to wait for all the earlier stages to complete first. One of the perf optimizations we make with VS Container Tools is to take advantage of the Visual Studio compilations on your dev machine.

A Closer Look at Multiple Projects and Solutions

The multi-stage dockerfile above is based on a Visual Studio solution. The full example can be found in this github repo representing a Visual Studio solution with a Web and API project. The additional unit tests are under the AddingUnitTests branch.

The challenge with solutions is they represent a collection of projects. We often think of dockerfiles specific to a single image. While true, that single image may be the result of multiple "projects".

Consider the common pattern of developing shared dlls that may represent your data access layer, your logging component, your business logic, an authentication library, or a shipping calculation. The Web or API project may each reference these project(s). They each need to take the compiled output from those projects and place it in the optimized image. This isn't to say we're building yet another monolithic application. There will certainly be additional services, such as checkout, authentication, profile management, communicating with the telco switch. But there's a balance. Microservices doesn't mean every shared piece of code is its own service.

If we look at the solution, we'll notice a few key aspects:

Each project, which will represent a final docker image, has its own multi-stage dockerfile

Shared component projects that are referenced by other resulting docker images do not have dockerfiles

Each dockerfile assumes its context is the solution directory. This gives us the ability to copy in other projects

There's a docker-compose.yml in the root of the solution. This gives us a single file to build multiple images, as well as specify build parameters, such as the image name:tag

Multi.sln
docker-compose.yml
[Api]
  Dockerfile
[Web]
  Dockerfile
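To see why the solution directory makes a useful build context, here's a hedged sketch of the build stage of Api/Dockerfile referencing a shared component; the project name Shared is hypothetical and not part of the layout above:

```dockerfile
# Build stage of Api/Dockerfile, built with the solution root as context:
#   docker build -f Api/Dockerfile .
FROM microsoft/aspnetcore-build:2.0 AS builder
WORKDIR /src
# Because the context is the solution directory, we can copy in the
# shared project alongside the Api project itself.
COPY Shared/Shared.csproj Shared/
COPY Api/Api.csproj Api/
RUN dotnet restore Api/Api.csproj
COPY . .
RUN dotnet publish Api/Api.csproj -c Release -o /app
```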

We can now build the solution with a single docker command. We'll use docker-compose, as our compose file has our image names as well as the individual build definitions.
version: '3'
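The version line above is only the start of the file. A representative compose file for this layout might look like the following; the image names multiweb and multiapi are placeholders:

```yaml
version: '3'

services:
  web:
    image: multiweb
    build:
      context: .            # solution root as build context
      dockerfile: Web/Dockerfile
  api:
    image: multiapi
    build:
      context: .
      dockerfile: Api/Dockerfile
```

Running docker-compose build from the solution root then builds both images, each with the solution directory as its build context.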

Coming Into Port

With multi-stage dockerfiles, we can now encapsulate our entire build process. By setting the context to our solution root, we can build multiple images, or build and aggregate shared components into images. By including our build environment in our multi-stage dockerfile, the development team owns the requirements to build their code, helping the CI/CD team to maintain a cattle farm without having to maintain individual build environments.

The multi-stage dockerfiles provided will be scaffolded by Visual Studio. As of this post, we're finalizing the release date, but hope to have it out soon. I'll present our Visual Studio tooling, including multi-stage dockerfile support, at an Ignite Pre-Con, so for those attending, I hope to see you there.
If you've got questions, thoughts, please let us know so we can incorporate the feedback into our Visual Studio tooling.
Thanks,
Steve