Playing with Python Structured Logs

August 23, 2018

Recently I’ve been experimenting with the Python project structlog
to add structured logging to our in-house applications. The eventual plan
is to emit the logs to an ELK stack with JSON parsing, instead of
the much more complicated set of custom parsing rules we have to define
for each type of log file ingested today.

This post is my current understanding of structlog, reordered from how the
docs are arranged to make a bit more sense to me. If you’re familiar with
the library and have something to add–let me know,
I’d love to hear from you.

Basic Usage

Structlog is installed with pip install structlog, optionally adding
colorama for color output in a terminal.

At this point the logging usage is identical to the stdlib logging library,
but you can add key-value pairs to the log calls.

import structlog
log = structlog.get_logger()
log.info("Hello World", a=1)
# 2018-08-23 17:59.51 Hello World a=1

Already this is an improvement for tools like awk or sed.

Powerful Structured Logging

Configuring structlog

While the basic usage is great for getting started, it still has a lot
of training wheels attached.

The power of structlog comes from the concept of “processors”, which are
a pipeline of functions that modify the event – the event_dict in all
of the example code. Since event_dict is a dictionary, keys can be
modified as needed in each function before the next function is called.

The last step in the processor function chain is the Renderer function,
which takes in the event_dict and “renders” it appropriately for the
output handler. By default, and with most of the examples, this is
the logging library from Python’s stdlib. So effectively there’s an
arbitrary dict that gets massaged into the kwargs for logging.getLogger().log().
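Concretely, a processor is any callable that accepts the logger, the method name, and the event_dict, and returns the (possibly modified) event_dict for the next step in the chain. A minimal sketch (the function name here is my own, not part of structlog):

```python
import socket

def add_host_info(logger, method_name, event_dict):
    # Attach the machine's hostname before the next processor runs.
    event_dict["host"] = socket.gethostname()
    return event_dict
```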

However, the basic usage comes with a set of default processors that have
to be explicitly re-declared when the processors list is modified, since
it’s a plain list rather than a numerically ordered set of middlewares
like Scrapy’s.
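For example, customizing the pipeline means spelling out the wanted defaults alongside any additions; the exact processors below are an illustrative assumption, not a canonical list:

```python
import structlog

structlog.configure(
    processors=[
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.TimeStamper(fmt="iso"),
        # custom processors slot in here, before the final renderer
        structlog.dev.ConsoleRenderer(),
    ]
)
```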

At this point I came to a screeching halt and had to take a step back
to understand what was going on.

Breaking Down The Processors

The processors in question do the normal pipeline of operations that logging
does, but I’ve never gone deep into logging except to know that:

logging will only output when the desired log level meets the logging level threshold.

I should use %-formatted strings and pass the substitutions as args to log.log().

These are optimizations in the logging library to minimize unnecessary processing.
The first rule allows us to entirely skip DEBUG messages when we’re not logging
DEBUG. The second rule allows us to bypass string interpolation on those calls.
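The second rule is worth seeing in stdlib terms: the %s substitution is deferred until a handler actually formats the record, so suppressed messages cost almost nothing. A small illustration:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

# The substitution only happens when a handler formats the record,
# so this DEBUG call never builds the string at INFO level.
log.debug("expensive value: %s", object())
log.info("user %s logged in", "alice")
```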

The structlog processor library is a mix of classes and functions. The
classes initially tripped me up with an error message about __init__
positional arguments before I realized not everything was a function.

Any classes have to be instantiated (generally with a configuration)
before being included in the processor pipeline. The pipeline then calls
the instance, treating it as a function. This works because all of these
included classes define the magic method __call__.
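The pattern can be sketched with a hypothetical processor class of my own; structlog’s bundled classes like TimeStamper follow the same shape:

```python
class AddStaticKeys:
    """A processor class: configured once, then called per event."""

    def __init__(self, **keys):
        self.keys = keys

    def __call__(self, logger, method_name, event_dict):
        event_dict.update(self.keys)
        return event_dict

# Instantiate with configuration; the pipeline later calls the instance.
add_env = AddStaticKeys(env="production")
```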

Custom Processors

This idea of processors opens up significant power to customize the
logging pipeline globally. As mentioned, the pipeline is a list of
Python functions.

Beyond enforcing tags, like adding a timestamp and host info, or
modifying the pipeline based on a deployment target, these custom functions
allow for event sampling–very necessary to avoid data collection overload.

Any processor can prevent an event from continuing through the processing
chain by raising the exception structlog.DropEvent.

Here’s an example with the ever popular FizzBuzz that will drop
every “Fizz”, as well as every other “FizzBuzz”. While just a toy example,
this could be extended to anything – log every delete but only 1 of 100
search queries.

Simple sampling like this is easy; effective sampling is a hard problem.
I recommend reading Baron Schwartz’s blog post on selecting representative samples
if you’re interested more in sampling.

Migrating to structlog

If I could update all the code in my environment at once, switching to structlog
could be done in an afternoon. However, I have multiple projects with
their own update schedules. I also have Sentry to contend with, which is
tied into the logging library.

Sentry

I found one GitHub issue
relating to configuring structlog with Sentry, but it ended up confusing
me more than helping. So here’s what did work for me in testing:

Conceptually what’s happening is we configure a logging handler,
attach it to the root logger, and then let structlog fetch the root logger
via the default logger_factory.
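In code, that setup might look something like the sketch below. I’ve substituted a StreamHandler for the actual Sentry handler (e.g. raven’s SentryHandler, which attaches the same way), and the processor list is an assumption:

```python
import logging

import structlog

# Stand-in for the Sentry logging handler; only errors should go to Sentry.
handler = logging.StreamHandler()
handler.setLevel(logging.ERROR)

# 1. Attach the handler to the root logger.
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(handler)

# 2. Route structlog events into stdlib logging so they reach the handler.
structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.stdlib.render_to_log_kwargs,
    ],
    logger_factory=structlog.stdlib.LoggerFactory(),
)

log = structlog.get_logger()
log.error("something broke", request_id="abc123")
```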

External libraries

Without changing everything to structlog at once, I also have to contend
with loggers in imported libraries that have not yet been updated or are
outside my control–otherwise the logs will be half JSON, half traditional
Python messages. That’s a sure guarantee to break any parser reading
these logs.

If I used a renderer other than render_to_log_kwargs, the entire event could
instead be encoded as the value of the message key, like
{"message": "event='Starting' level='info' logger='__main__' timestamp=1535058769.622922"}.
It’s important to double check the format of the output while testing and
that Sentry is correctly grouping the log messages sent.

Now I can gradually migrate my internal libraries over to structlog,
while all of the application’s log output switches to JSON at the same time.
