Node.js Logging with Bunyan and Azure

In this article you will learn how to install Bunyan, store your logs in Azure Tables and query them.

A while ago we started developing a mobile app and decided to go for a full HTML, CSS and JavaScript stack. We chose to host the Node.js backend on Azure, since it has pretty good Node.js support. We needed a good logger that could take advantage of Azure Storage.

Bunyan

Why Bunyan?

There are two well-known Node.js loggers that could do the job:

Winston: This library is a component of Flatiron, a larger Node.js framework. Today, it is probably the most popular option.

Bunyan: This library writes logs as JSON, which ensures they can easily be processed by both humans and machines.

I won't detail differences here, but you can read this article for a good comparison.

Both libraries support routing specific log levels to different storage targets. For instance, you can persist 'error' logs in a database and write other log levels to files with log rotation. They are both extensible too. Two reasons led me to choose Bunyan over Winston:

Serializers: Serializers are Bunyan's way of dealing with circular references, but they also let you control how common objects such as Express req or res are serialized. This makes logging quite easy.

JSON serialization: Bunyan serializes logs to JSON by default, which makes automated processing very easy. It is also quite convenient for building back-office tools around the logs, such as pretty printing, search, etc.

Let's take a look at a sample using those features:

log.error({ err: err, req: req, correlationId: correlationId });

It outputs all the important information I might need to troubleshoot the issue, such as cookies or headers.

Setup Bunyan

First steps

Let's install Bunyan. First, run the following npm command:

npm install bunyan --save

Then create a logger. You can create several loggers for different usages. For instance, you might use one logger to log health information and another to log errors, warnings and info messages. But let's stick to a simple use case.
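For instance, a minimal logger might be created like this (the logger name is illustrative); `bunyan.stdSerializers` provides the built-in `err`, `req` and `res` serializers mentioned above:

```javascript
var bunyan = require('bunyan');

// Create a logger; stdSerializers safely serializes err, req and res
// objects (no circular references, only the useful fields).
var log = bunyan.createLogger({
    name: 'myApp',                       // illustrative logger name
    serializers: bunyan.stdSerializers,
    streams: [
        { level: 'info', stream: process.stdout }
    ]
});

log.info('logger ready');
```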

Storing logs

Now that we have a better logging system than console.log, let's see what we are going to do with our logs. As I said, we are targeting Azure web sites. Their file storage scales up to 50 GB, and currently there is no option to extend the space beyond that limit.

Local storage

You have to keep in mind that this storage is limited. Without a proper eviction strategy, you risk filling the space with logs. Moreover, you might need it for other purposes, such as caching or static data.
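If you do keep some logs on local disk, Bunyan's built-in rotating-file stream type is one way to bound their size; the path and limits below are illustrative:

```javascript
var bunyan = require('bunyan');

var log = bunyan.createLogger({
    name: 'myApp',                   // illustrative logger name
    streams: [{
        type: 'rotating-file',
        path: '/var/log/myapp.log',  // illustrative path
        period: '1d',                // rotate daily
        count: 7                     // keep at most 7 rotated files
    }]
});
```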

Why Azure storage?

It might seem obvious, but I'm not storing logs for nothing: I am planning on using that data. So here are my two basic requirements:

I want a way to easily gather logs associated with an error that a user has reported.

I need a solution that scales enough to be eligible for production.

When I report an error to my API consumers, I provide very limited information. This way I ensure no sensitive information is leaked.

As you can see, the error payload includes a correlationId. This is the ID I want to use to query my logs.

Now that we have defined what we want to achieve, we can take a look at Azure Storage. It comes with tables, blobs and queues, each built with high availability, scalability and throughput in mind. Let's focus on the first two.

Azure Tables is a key/entity store with very high throughput. It costs less than $5 per 50 GB per month. When storing data in Azure Tables you need to define a two-part key and provide an entity, which is basically a set of columns. As we said, the key is composed of two parts:

Partition key: This part of the key is used to spread data across Azure servers. Planning how this key is built is very important because it defines your ability to scale while still querying data efficiently. Every entity sharing the same partition key is stored on the same physical server, so if you know the right partition key, you can make sure a query runs on a single server and returns quickly.

Row key: The unique ID of your entity within a partition.

Azure Blobs store regular files. They cost less than $3 per 50 GB per month and support both pessimistic and optimistic concurrency handling.

Both components might prove very efficient at storing logs, but from the query perspective, Azure Tables beat Azure Blobs. In fact, Azure Blobs might be more interesting for long-term retention and analysis (for instance, statistics or anomaly detection). So let's go for Azure Tables.

Introducing az-bunyan

Bunyan has no support for Azure out of the box, but it has a simple extensibility model. So what I need is a custom Bunyan stream which writes my logs to an Azure table. There was already an npm package doing that, bunyan-azure, but it had some issues and appears to be unmaintained. This is why I decided to write a new module, az-bunyan.
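Wiring az-bunyan into a logger should look roughly like this sketch, assuming the module exposes a createTableStorageStream factory taking a level and the storage options; the connection string, table name and logger name are placeholders:

```javascript
var bunyan = require('bunyan');
var azBunyan = require('az-bunyan');

// Create a stream that writes every 'warn' (and above) entry
// to the given table of the storage account.
var azureStream = azBunyan.createTableStorageStream('warn', {
    connectionString: 'DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...', // placeholder
    tableName: 'logs'                                                                  // placeholder
});

var log = bunyan.createLogger({
    name: 'myLogger',
    streams: [azureStream]
});
```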

By default it uses the partition key format loggername_YYYY-MM-DD-HH and a new UUID as the row key. You can override this behavior with whatever you need. For instance, in my case, I want the row key to start with the correlationId.
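To make the default key format concrete, here is a small helper reproducing the loggername_YYYY-MM-DD-HH pattern; this is my own sketch, not az-bunyan's actual code:

```javascript
// Build a partition key in the default az-bunyan format:
// loggername_YYYY-MM-DD-HH, i.e. one partition per logger per hour.
function partitionKey(loggerName, date) {
    function pad(n) { return n < 10 ? '0' + n : '' + n; }
    return loggerName + '_' +
        date.getUTCFullYear() + '-' +
        pad(date.getUTCMonth() + 1) + '-' +
        pad(date.getUTCDate()) + '-' +
        pad(date.getUTCHours());
}

var key = partitionKey('myLogger', new Date(Date.UTC(2014, 5, 3, 9, 30)));
// key === 'myLogger_2014-06-03-09'
```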

Search for a specific log

To query the logs you can obviously build some features into your back office, but in the meantime you can take advantage of Azure Storage Explorer.

It has a built-in query tool.

You should filter on the partition key to reduce query execution time. You can also search across more than one partition at a time using 'starts with', which allows you to perform ranged queries over several partitions.
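For instance, a ranged query over every hourly partition of a given day can be expressed as a filter on the partition key; here is a sketch building such a filter string (the logger name is illustrative), relying on the lexicographic ordering of loggername_YYYY-MM-DD-HH keys:

```javascript
// Build an OData-style filter matching every hourly partition of one day.
// Keys sort lexicographically, so 'ge ...-00' and 'le ...-23' cover the day.
function dayFilter(loggerName, day) {
    var prefix = loggerName + '_' + day;   // e.g. 'myLogger_2014-06-03'
    return "PartitionKey ge '" + prefix + "-00' and PartitionKey le '" + prefix + "-23'";
}

var filter = dayFilter('myLogger', '2014-06-03');
// "PartitionKey ge 'myLogger_2014-06-03-00' and PartitionKey le 'myLogger_2014-06-03-23'"
```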

Conclusion

Now you are ready to add Bunyan to your Node.js projects and store your logs in Azure Tables. You can download the sample here. Just run npm install and node . and you're all set.

I'm planning to add new features to az-bunyan, such as blob support or log rotation. If you have use cases or questions, I will be happy to read your comments below. Feel free to submit issues on my GitHub page as well! May code be with you.