5 Steps to Better Application Logging

Published
February 28, 2013

Logs are a source of time-ordered events about everything happening with your app. But their inconsistent verbosity and substance obscures the big-picture view. What if you could easily and automatically roll them up into daily charts, or run ad-hoc queries to look for correlations in user behavior?

Viewed this way, logs are far more than ephemeral output or a simple debugging aid. Logs are data that provide incredible operational and business visibility.

The possibilities

A well-structured, relevant event stream lays the foundation for higher levels of abstraction, aggregation and visualization. Consider the following uses for your log data:

Use a service like Papertrail to search logs, issue alerts when certain conditions are met, and archive logs on S3 to mine with an Elastic MapReduce job.

Pipe logs of a known format to your own metrics analysis service to calculate time-based statistical averages, which are then sent to Librato for visualization on a team dashboard.

In their default form, it's not likely that your application logs can power services such as these or give you meaningful insight. Here are 5 steps to change that.

Choose a core set of events

Logs are not an involuntary byproduct of your application. They are deliberately created by you, the developer, to provide insight. Don't dilute this channel by logging everything you can access. Instead, reduce your logs to their essence by organizing log data along two axes: operational events and business events.

Operational data indicates the general health and performance of your application. What are the three most crucial metrics for knowing if your application is functioning properly? Include these measurements, such as response times, requests per second and queue length, in your log stream.

The other log axis, business data, relates to your application’s efficacy in serving the business and its users. Apply the same exercise: Choose the three questions you must answer to know if your application is serving your business. Keep the vocabulary at the user level to prevent regressions to the operational aspects of the app.

By way of example, a log management business like Papertrail must authoritatively know:

Is log processing behind?

How far behind is the processing queue?

How many customers are affected?

Focus on your operational and business needs to identify a small set of metrics. From this initial dataset you can expose the metrics that are of unique value to you and your business.

Configure log output

Application logs should be streamed, unbuffered, to STDOUT and not stored in files on the local system. These event streams, when piped to other services, can be collated across sources into a single, sequential system-wide view. Determine the necessary settings for your language, framework or library to output logs to STDOUT.
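As one illustration, a Rails application can be pointed at STDOUT with a few lines of configuration (a sketch for a standard Rails setup, not the only option; other frameworks have equivalent settings):

```ruby
# config/environments/production.rb (fragment)
# Logger comes from Ruby's standard library.
STDOUT.sync = true                  # unbuffered: emit each event immediately
config.logger = Logger.new(STDOUT)  # stream to STDOUT instead of log/production.log
```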

In such streamed environments, keep each logical message to a single line to avoid interspersion with other time-ordered messages. Consider two simultaneous requests to a Rails app running on two compute instances, interleaved into a single output: any multi-line message from one request will be split apart by lines from the other, making it impossible to reassemble.
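The failure mode can be sketched in a few lines of Ruby (the request names and messages are invented for illustration):

```ruby
# Hypothetical demonstration of why multi-line messages break down in a
# merged stream: two requests each log three lines, and the collated
# output interleaves them so neither message survives intact.
req_a = ["A: begin checkout", "A: charging card", "A: done"]
req_b = ["B: begin signup",   "B: creating user", "B: done"]

merged = req_a.zip(req_b).flatten  # simulate strict time-ordered interleaving
puts merged
# A: begin checkout
# B: begin signup
# A: charging card
# B: creating user
# A: done
# B: done
```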

Format as key-value tokens

Historically, logs have been targeted at human eyes. But logs can serve more than one master. Log management software (such as Fluentd and Scribe) and services (such as Papertrail and Splunk) receive log streams for higher-order processing and analysis, and can be attached to most modern platforms like Heroku or to any infrastructure with access to syslog-ng.

Human readability and machine parsability are not mutually exclusive concerns. Serve both by choosing a structured log format that contains as little decoration as possible. A simple key=value format works well for its easy command line scriptability and token extraction. Also, its implementation is trivial and independent of the implementation language:

logger.info("measure.queue.backlog=" + queueBacklog);

Target both humans and machines when creating your log statements with a simple, structured format. This ensures the logs' viability and longevity.
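A minimal key=value formatter is only a few lines. This Ruby sketch (the helper name kv_log is an assumption, not a library API) quotes values containing spaces so each event remains one parsable line:

```ruby
# Format a hash of log data as a single key=value line.
def kv_log(data)
  data.map do |key, value|
    value = value.to_s
    value = "\"#{value}\"" if value.include?(" ")  # quote multi-word values
    "#{key}=#{value}"
  end.join(" ")
end

puts kv_log("measure.queue.backlog" => 23, "source" => "instance-1041")
# measure.queue.backlog=23 source=instance-1041
```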

Use consistent log keys

Naming is hard. This is true of many things, including log keys. Using name-spaced keys within a consistent key-set simplifies the instrumentation of logs and enables tools that consume these logs.

Consider the following core keys for your log data:

measure.* -- the name-spaced label of the data-point, prefixed by measure. Examples: measure.session.login, measure.queue.backlog=23

source -- the optional source of the log, for use in multi-component and distributed systems. Example: source=instance-1041

Using these conventions clearly identifies the primary measurement of each log statement and establishes a known key-set that other tools can interrogate.
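Because the key-set is known, a consuming tool can extract every measurement with a single pattern. A sketch (the regular expression is an assumption matching the convention above):

```ruby
# Pull all measure.* tokens out of a log line.
line = "source=instance-1041 measure.queue.backlog=23"
measures = line.scan(/(measure\.[\w.]+)=(\S+)/).to_h
puts measures.inspect
# {"measure.queue.backlog"=>"23"}
```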

Logs using this convention can be piped to a service like l2met which will calculate a set of time-based functions on the data and send it to Librato for visualization.

Include contextual data

This is not to say that log data is limited to this key-set. Related and relevant event information should be included to support querying, simple reporting and historical analysis.

Include the directly participating entities for context and namespace them appropriately (user- and article-) to enforce consistency across logs. When logging its article-publishing activity, for example, the Heroku Dev Center includes identifiers for both the article being published and the user publishing it.

Strike a balance between machine and human readability by including both uniquely identifiable machine ids (user-id, article-id) and human-readable values (editor@heroku.com, "Dynos and the Dyno Manifold").
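A reconstruction of such a publish event might look like the following (every key name and id value here is an illustrative assumption; the original example is not preserved in this text):

```ruby
# Hypothetical publish event for the Dev Center example above.
event = {
  "at"            => "article.publish",
  "article-id"    => 511,                        # assumed id
  "article-title" => "Dynos and the Dyno Manifold",
  "user-id"       => 1041,                       # assumed id
  "user-email"    => "editor@heroku.com",
}

line = event.map do |key, value|
  value = value.to_s
  value = "\"#{value}\"" if value.include?(" ")  # quote multi-word values
  "#{key}=#{value}"
end.join(" ")

puts line
```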

Also consider including a session or request id on every log statement to make it easier to trace the path of specific entities. This key should represent as long an unbroken session as you can identify while still offering enough granularity to be useful as a debugging tool.

A simple implementation could be the pairing of values that are unlikely to change, such as the user's system id/username, with a more time-constrained value like the current date.
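One such pairing (the scheme and helper name are assumptions) combines a stable user id with the current date, producing a key that spans a day's worth of activity:

```ruby
require "date"

# Build a session key from a stable id plus the current date.
def session_key(user_id)
  "#{user_id}-#{Date.today.strftime('%Y%m%d')}"
end

puts session_key("u1041")  # e.g. u1041-20130228; attach it as request-id=... on every statement
```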

Logging libraries like Scrolls allow you to set contexts, which are ideal for session keys.

Log data is unique to every application. However, be disciplined in your logging by including a minimal but consistent set of supporting information.

Conclusion

Logs are a loosely coupled source of rich application data. Leverage this data source with a purposeful approach to logging and gain deep insight into your app's performance profile and user behavior. Experiment with logging and log-management services to find the right level of visibility for your needs.

Ryan Daigle
works at Heroku, where he attempts to expose modern technology practices through journalism. This article was written with invaluable contributions from Troy Davis and numerous Heroku engineers.

Miyagi is a journal of high-quality, general-purpose technical content. Its contributors
are thoughtful practitioners of modern application development with a variety of interests.