Logging with Docker — Part 1

Introduction

Logging is a central piece of almost every application. It is one of those things you want to keep around, especially when things go wrong. Much has been said and done on this particular subject; today, however, I want to focus on Docker and the options available for logging in the context of containerized applications. Logging capabilities in the Docker project are exposed in the form of drivers, which is very handy since one gets to choose how and where log messages should be shipped. At the time of writing there are five of them:

json-file: This is the default driver. Everything gets logged to a JSON-structured file

syslog: Ship logging information to a syslog server

journald: Write log messages to journald (journald is a logging service which comes with systemd)

gelf: Ship log messages in GELF (Graylog Extended Log Format) to an endpoint such as Graylog

fluentd: Ship log messages to a fluentd collector

This article is the first in a series centered around logging in Docker. We will take a rather in-depth look at the first two drivers, json-file and syslog: how they work, and what their strengths and limitations are.

The examples given throughout the article were created and tested using Docker v1.8, which introduces the gelf and fluentd drivers along with a few nice additions to the syslog one (tagging, logging facility, etc.). You can read more about it here.

NOTE: This article’s source code, along with the code used, can be found on GitHub

Docker Logging 101

Before getting into the specifics of any of the available logging drivers, there are two questions you might have already asked yourself:

What messages are actually logged?

How do I get my messages shipped to the logging driver?

The answer to the first question is actually dead easy: anything that gets written to the container’s standard output/error will get logged. As for the second one, you don’t need to do much, at least not by default. When a new container is created, and provided that no logging-specific options were passed at creation time, the Docker engine will configure the default logging driver, which is the json-file one. So if we put 1 and 2 together, it’s clear that anything your application writes to its standard output/error will end up in a JSON file. That simple.

OK, that last paragraph was a bit long, so enough wording, let’s see it in action. First we will create a simple Python script which writes to both its stdout and stderr:
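The original script is not shown here, so the following is a minimal stand-in: a few messages to each stream, which is all the logging driver cares about.

```python
import sys

def log_messages(count=3):
    """Write a few messages to stdout and stderr; inside a container,
    both streams are picked up by the configured logging driver."""
    for i in range(count):
        sys.stdout.write("Hello stdout %d\n" % i)
        sys.stderr.write("Hello stderr %d\n" % i)

if __name__ == "__main__":
    log_messages()
```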

This illustrates how messages written to stdout/stderr are logged; the output is readily available via the docker logs command. But wait, isn’t this the json-file driver? So where is this so-called JSON file? Let’s inspect our container and see what we can find:
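Assuming the script above is baked into an image, the run could look like the sketch below (the image and container names are made up for illustration):

```shell
# Run the container; no logging options given, so the json-file
# driver is used by default (names are illustrative).
docker run --name logger-demo my-logger-image

# The messages written to stdout/stderr are echoed back:
docker logs logger-demo

# Ask the engine which driver the container ended up with:
docker inspect --format '{{.HostConfig.LogConfig.Type}}' logger-demo
```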

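The json-file driver also accepts options to rotate and cap its files, added in Docker 1.8. A sketch of re-creating the container with them (names are illustrative):

```shell
# Keep at most 2 log files, each capped at 2k, then roll over.
docker run --name logger-demo \
    --log-driver=json-file \
    --log-opt max-file=2 \
    --log-opt max-size=2k \
    my-logger-image
```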
Here we have indicated that we only want to keep two log files and that the maximum allowed size for any given log file is 2k. You can verify that this is indeed happening by grabbing the location of the default log file:
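One way to do that (the actual path depends on your Docker root directory):

```shell
# The engine records where the container's JSON log file lives:
LOG_PATH=$(docker inspect --format '{{.LogPath}}' logger-demo)

# Each line is a JSON object with "log", "stream" and "time" fields:
sudo cat "$LOG_PATH"
```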

The Syslog driver

The json-file driver is as simple as it gets, and while it is very useful for development purposes, it won’t do the job once the list of running containers starts to grow. This is not strictly a matter of performance; rather, as the number of containers and applications increases, tracking logging information across all of them becomes unmanageable. Enter the syslog driver.

This is one of the logging drivers which allow you to send messages to a local or remote server by means of a network protocol. So let’s get ourselves a syslog server … running in a container of course ;):
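A sketch of the setup: one container acting as the syslog server, and our logger pointed at it through the syslog driver. The rsyslog image name here is an assumption; any syslog server listening on port 514 will do.

```shell
# Start a syslog server in a container, exposing the standard port
# (the image name is an assumption; any syslog server works).
docker run -d --name rsyslog -p 514:514 -p 514:514/udp some/rsyslog-image

# Run our logger against it via the syslog driver. The syslog-address
# and syslog-tag options were added in Docker 1.8.
docker run --name logger-demo \
    --log-driver=syslog \
    --log-opt syslog-address=tcp://127.0.0.1:514 \
    --log-opt syslog-tag=logger-demo \
    my-logger-image
```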

If you now try docker logs against this container, you will get an error, which totally makes sense once you give it five seconds of thought: there is no way docker logs could work as before (nor should it try to), given that messages are sent to a potentially remote server. Instead we need to look into the rsyslog container, which is acting as our syslog server:
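For example, by peeking at the server’s log file (the path inside the rsyslog container depends on its configuration):

```shell
# Our messages should show up in the server's log, prefixed with the
# tag we configured (the log file path is configuration-dependent).
docker exec rsyslog tail /var/log/messages
```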

What’s next?

The syslog driver is an enhancement over the json-file one, especially for environments with multiple applications powered by containers, at the cost of introducing a dependency on an external service. It is also a step forward towards centralizing log messages. However, as the number of applications grows, and with it the number of running containers, you will find yourself in need of a tool that lets you search for specific log messages based on a date/time range, a keyword, etc. (don’t grep me on this one please ;)).

In the next article of this series we will take a look at the gelf and fluentd drivers to see how they might help overcome some of the issues mentioned above.