Notes on Docker in Practice Part 1

A rare thing happened to me on Friday: I ran out of work. I didn’t run out in the sense that I sprinted to the elevator and down Jackson Boulevard. That happens with some frequency. I ran out of work in that I didn’t have much meaningful work to do. Even this is a bit of an exaggeration, but I was waiting on a few merge requests and the office had mostly vacated by mid-afternoon. I checked out some of the single-serving O’Reilly books on DevOps, but then out of the corner of my eye I saw Docker in Practice on a colleague’s desk. I was a bit apprehensive to pick it up since it was published in 2016 (and thus potentially written in 2015), but I figured it would be the best use of my time. It turned out to be a great read. You can find my short review here.

The book is split up into 4 sections and 101 “techniques” for solving problems with Docker. I found myself really engaging with some of the techniques and writing a lot in my small notebook. I’ve never structured a post this way: below, I highlight select techniques from the book, accompanied by my own commentary.

Technique 4: Use socat to monitor Docker API traffic

Seeing Docker as a black box rather than a client-daemon architecture with an API was the cause of many of my early follies. Immediately resorting to GitHub issue threads when encountering problems has kept me from firming up my grasp on the technology. Socat (SOcket CAT) is a useful tool that I didn’t know about until I read this technique. It allows you to inspect the API calls being made to the Docker socket. Since it can be used between any two bidirectional byte streams, this tool is probably well known to network engineers.
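A minimal sketch of how this looks, assuming socat is installed and the daemon listens on the default /var/run/docker.sock (you may need sudo to read the socket):

```shell
# Terminal 1: proxy the Docker socket through socat, printing (-v) all traffic
socat -v UNIX-LISTEN:/tmp/dockerapi.sock,fork UNIX-CONNECT:/var/run/docker.sock

# Terminal 2: point the Docker client at the proxied socket and run any command
docker -H unix:///tmp/dockerapi.sock ps
```

Every request the client makes now flows through socat, which echoes it to your terminal on the way past.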

The resulting output will include the HTTP request, the HTTP response, the JSON content, and then the output you would normally see. For a simple command like docker images, the results aren’t particularly interesting, but I can see socat becoming an invaluable debugging tool for me, especially while first digging into an issue.

A wise man once told me that the best way to learn Docker is by splitting up monoliths and practicing dockerizing your favorite libraries. He was right, which is why I still regard him as a wise man. While dockerizing libraries might just be a fun coding exercise, splitting up monoliths is a fun, useful application of Docker. By splitting a monolith into containers of pure component parts, and thus eliminating sources of bloat or bastardization, you’re able to enjoy a more stable, more agile environment everywhere from dev to production.

This technique appears early in the book and doesn’t cover the composition (ahem, docker-composing) of such a service, but I regard it as a fundamental, crucial to understand before using Docker anywhere.

Technique 27: Inspecting containers

As I mentioned in my Reverse Engineering Docker Run Commands post, there are a few ways you can compensate for a lack of Docker knowledge when you need to uncover more information about a container, but there is nothing like having a thorough understanding of docker inspect and what is stored in the container’s metadata.

Casting hardcore grep and awk usage aside, the inspect command has a --format flag that can be used to easily parse fields out of the returned JSON blob.
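For instance, pulling a single field out of the metadata looks like this (the container name "web" is hypothetical):

```shell
# Print the container's IP address straight from the inspect metadata
docker inspect --format '{{.NetworkSettings.IPAddress}}' web

# The brittle alternative: grep through the full JSON blob
docker inspect web | grep IPAddress
```

The --format argument is a Go template, so nested fields are reached with dotted paths just like in the JSON itself.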

Raise your hand if you've been victimized by a layer in a Dockerfile build being cached.

The problem I've most often encountered is when I'm installing from a git repository, specifying a branch rather than a tag, and Docker, none the wiser, uses the cached layer since the instruction hasn't changed even though the underlying repo has. To resolve this, you can build images using the --no-cache flag. If you want to be more selective and not bust your cache every time you build, you can instead add a single cache-busting line above the instruction prone to cache confusion and change a few gibberish characters on that line each time you need a fresh build. Alternatively, though this doesn't solve all issues, you can build images only from tagged versions of a repository. This has helped me prevent this embarrassing problem from rearing its ugly head in prod.

In Technique 32, the authors recommend inserting a single comment to bust the cache starting on a certain line. I regard this as a cleaner approach than the one I advocated above.
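A sketch of the gibberish-line approach (the base image, repo URL, and branch below are all hypothetical):

```dockerfile
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y git

# Cache-busting line: change the gibberish below to invalidate
# this layer and every layer after it on the next build
RUN echo qwertyuiop

# Without the line above, this layer would be served from cache
# even after the branch has moved
RUN git clone -b mybranch https://github.com/example/example.git
```

The comment-based variant from Technique 32 works the same way: any textual change to an instruction changes its cache key, forcing a rebuild from that point down.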

Technique 34: Housekeeping containers

The first time I appeared on the SysEng radar at my workplace as a potential threat was when I started contributing to some of our Docker-based software and had no idea how to manage the containers and the space they gobbled up.

Rather than deleting all containers, there are more pragmatic and safe ways to clean up a lot of disk space without blowing up something important.

It's important to first clean up all containers, as this may free up more images that can then be cleaned up. The opposite is not true.

docker rm $(docker ps -qa --no-trunc --filter "status=exited")

This will get rid of all exited containers

docker rmi $(docker images --filter "dangling=true" -q --no-trunc)

This will get rid of all dangling images (untagged layers no longer referenced by anything), potentially reclaiming a large amount of disk space with no impact on anything currently running

Once you realize that the Docker daemon exposes an API, the wheels of your mind start turning. What if there were a graphical interface that allowed you to manage all of your containers and images? Turns out, there is. It used to be called DockerUI, but that project merged into Portainer. I've been using Portainer for a little over a month now and it's miraculous. The setup is easy because it's Dockerized, and the full functionality (which I have yet to use to its full potential) is a game changer. This is something I wish I had discovered much sooner in my career.
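Getting it running is essentially a one-liner; the image name and port below are the ones Portainer documented at the time of writing, so check the current docs before copying:

```shell
# Run Portainer, handing it the Docker socket so it can drive the daemon's API
docker run -d -p 9000:9000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  portainer/portainer
```

Then browse to port 9000 on the host. Mounting the socket is what lets a containerized UI manage the very daemon that runs it.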

Though the article in which I recommended pinning versions of Docker dependencies was mainly advocating for version pinning in general, it's an important lesson. The problem, plainly stated in this chapter, is that "you want to ensure that your packages are the versions you expect." This sounds innocuous, but I've been bitten by it in a big way. Once you introduce a bug with an automatically updated package version in a large project, it is a nightmare (or, in my words, a horror story) to try to track down and debug. Please consider pinning versions!
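In a Dockerfile this just means being explicit; the package names and version numbers below are illustrative, not recommendations:

```dockerfile
FROM ubuntu:16.04

# Pin the exact package version rather than taking whatever apt resolves today
RUN apt-get update && apt-get install -y \
    python-pip \
    nginx=1.10.0-0ubuntu0.16.04.4

# Same idea one level up, for language-level dependencies
RUN pip install requests==2.12.4
```

An unpinned `apt-get install nginx` builds a different image next month than it does today; the pinned version builds the same one until you decide otherwise.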

Technique 42: Reverse-engineer a Dockerfile from an image

Yes, this is possible. Similar to what I said about Technique 27, understanding the Docker inspect metadata is crucial. In this example, jq (like sed for JSON data) is used to parse the ContainerConfig fields and recreate the Dockerfile. Wow!
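As a daemon-free sketch of the jq side, here is the kind of extraction involved, run against a stubbed fragment of docker inspect output (the real technique feeds in `docker inspect myimage` and walks each layer's ContainerConfig to rebuild the full Dockerfile):

```shell
# Normally the input would come from: docker inspect myimage
# Stubbed here so the jq extraction can be seen end to end
echo '[{"ContainerConfig":{"Cmd":["/bin/sh","-c","apt-get update"]}}]' \
  | jq -r '.[0].ContainerConfig.Cmd | join(" ")'
# → /bin/sh -c apt-get update
```

Each image layer records the command that created it, so stitching these strings together in order recovers a close approximation of the original Dockerfile.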

My response to this is: please think twice. I just recently was able to deprecate our bloated base Makefile at work, and everyone sleeps a little better at night. It can get very confusing very quickly, and while I recognize it as a valid tool for certain use cases, I have trouble endorsing it. If you really, truly need additional tasks, then it's totally reasonable to use. In many cases, though, I'd imagine people get a bit overzealous and focus their energy here instead of properly using docker-compose.

Well, that's probably all you care to read about this topic for now. In part 2 I will cover techniques 49 through 101. Thanks for reading.