Set up ELK for NGINX logs with Elasticsearch, Logstash, and Kibana

The ELK Elastic stack is a popular open-source solution for analyzing web logs. In this blog post, I describe how to set up Elasticsearch, Logstash, and Kibana on a barebones VPS to analyze NGINX access logs. I don’t dwell on details but instead focus on the things you need to get up and running with ELK-powered log analysis quickly.

Compared to other available tools, ELK gives you extreme flexibility in how you analyze and present your log data. Hosted solutions are a bit pricey, with monthly costs starting around $50 for a reasonable feature set. By following this tutorial you can set up your own log analysis machine for the cost of a simple VPS server. You don’t need to be a DevOps pro to do it yourself.

The ELK stack will reside on a server separate from your application. NGINX logs will be sent to it over an SSL-protected connection using Filebeat. We will also set up GeoIP data and a Let’s Encrypt certificate for Kibana dashboard access.

This step-by-step tutorial covers version 6.2.4 of the ELK stack components.

Just to show you a sneak peek of what we will be building in this tutorial series:

Kibana charts generated from NGINX access logs of this blog

Let’s get started.

Set up a VPS

Start by purchasing a barebones VPS and setting up SSH access to it. I won’t elaborate on how to do that in this tutorial.

You will also need a domain or subdomain pointed at your VPS server IP using a DNS A record. You can check out my other blog post for tips on how to save money on domain names. If you use Cloudflare for your DNS, remember not to use their CDN for this domain, because it changes the IP the domain resolves to and can cause trouble with the setup.

For my ELK stack server, I use a 4GB Digital Ocean VPS with Ubuntu 16.04. It runs the Elasticsearch, Kibana, and Logstash processes. With my current amount of traffic log data, 4GB of RAM has been enough so far.

Install ELK dependencies

Access your VPS and run the following commands as a sudo user to install the required dependencies:

Java

Java is required by both Elasticsearch and Logstash. Make sure to install Java 8: at the time of writing, Logstash is not yet compatible with newer Java versions.
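A minimal sketch on Ubuntu 16.04, assuming OpenJDK 8 from the standard repositories:

sudo apt-get update
sudo apt-get install -y openjdk-8-jre-headless
java -version

With Java in place, Elasticsearch, Kibana, and Logstash can all be installed from Elastic’s official APT repository (again a sketch, for the 6.x line):

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-6.x.list
sudo apt-get update
sudo apt-get install -y elasticsearch kibana logstash
sudo systemctl enable elasticsearch kibana
sudo systemctl start elasticsearch kibana

You can verify that Elasticsearch is up with curl -v http://localhost:9200.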

Just like in the case of Elasticsearch, you can verify that Kibana is running by using a cURL command:

curl -v http://localhost:5601

Now let’s expose access to our Kibana dashboard to the external world using NGINX.

NGINX

This is not the NGINX we will be analyzing logs from. This one will be used to provide password-protected access to the Kibana instance running on our ELK server. We will use a Let’s Encrypt SSL certificate for secure access. We can do it by typing:
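A sketch of the commands involved (assumptions: Ubuntu 16.04 package names, kibanaadmin as an example user name, and the letsencrypt client obtaining the certificate in standalone mode, which needs port 80 free):

sudo apt-get install -y nginx apache2-utils
sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin

sudo apt-get install -y letsencrypt
sudo systemctl stop nginx
sudo letsencrypt certonly --standalone -d my-elk-stack-vps.com

And the self-signed certificate that will secure the Filebeat connection:

sudo mkdir -p /etc/elk-certs
sudo openssl req -x509 -nodes -days 3650 -newkey rsa:2048 \
  -keyout /etc/elk-certs/elk-ssl.key \
  -out /etc/elk-certs/elk-ssl.crt \
  -subj "/CN=my-elk-stack-vps.com"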

Remember to substitute my-elk-stack-vps.com with your domain name in the command generating the self-signed certificate. Later we will have to copy the resulting files to the client server for the Filebeat configuration.
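For reference, a minimal sketch of the NGINX server block proxying to Kibana (it assumes the letsencrypt files ended up under /etc/letsencrypt/live/my-elk-stack-vps.com/ and uses the htpasswd.users file created above):

server {
    listen 443 ssl;
    server_name my-elk-stack-vps.com;

    ssl_certificate /etc/letsencrypt/live/my-elk-stack-vps.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/my-elk-stack-vps.com/privkey.pem;

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        # Kibana listens on localhost only; NGINX handles SSL and auth
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
    }
}

After saving it, validate and restart NGINX with sudo nginx -t && sudo systemctl restart nginx.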

Configure GeoIP data

In order to map visitor IP addresses to geographical locations, we need to download a GeoIP database:
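A sketch, assuming MaxMind’s free GeoLite2 City database (MaxMind has since put these downloads behind a free license key, so the URL may need adjusting):

cd /etc/logstash
sudo curl -O "http://geolite.maxmind.com/download/geoip/database/GeoLite2-City.mmdb.gz"
sudo gunzip GeoLite2-City.mmdb.gz

With the database in place, create a Logstash pipeline config, e.g. /etc/logstash/conf.d/nginx.conf (a sketch: the file name is an example, and the grok pattern assumes NGINX’s default combined log format):

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/elk-certs/elk-ssl.crt"
    ssl_key => "/etc/elk-certs/elk-ssl.key"
  }
}

filter {
  # NGINX's combined format matches the stock Apache pattern
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # resolve visitor IPs to geographical locations
  geoip {
    source => "clientip"
    database => "/etc/logstash/GeoLite2-City.mmdb"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weblogs-%{+YYYY.MM.dd}"
  }
}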

This config specifies the input and output for our logs and how they will be formatted before being sent to Elasticsearch. GeoIP data is configured here as well. It also enforces a secure SSL connection, signed by the correct certificate, for logs sent by Filebeat.

Now let’s start the Logstash process and verify that it is listening on the correct port:
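A sketch using systemd (the beats input above listens on port 5044):

sudo systemctl enable logstash
sudo systemctl start logstash
sudo netstat -tulpn | grep 5044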

If it does not work, you can check out the troubleshooting guide at the end of the post.

Filebeat

Filebeat is the only part of the infrastructure that needs to be installed on a client server. You should log in to the server running your NGINX application and copy the self-signed SSL certificate files to the correct folder:

/etc/elk-certs/elk-ssl.crt
/etc/elk-certs/elk-ssl.key

You can use SCP to do it or just copy and paste the contents of the files.
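For example, from the ELK server (elk-user and app-server.com are placeholders for your own credentials and host, and you may need to create the target folder on the client first):

scp /etc/elk-certs/elk-ssl.crt elk-user@app-server.com:/etc/elk-certs/
scp /etc/elk-certs/elk-ssl.key elk-user@app-server.com:/etc/elk-certs/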

Now, install Java using the same commands as on the main ELK host server. Then install the rest of the dependencies:
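A sketch that fetches the matching 6.2.4 Filebeat package directly from Elastic:

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.2.4-amd64.deb
sudo dpkg -i filebeat-6.2.4-amd64.deb

Then edit /etc/filebeat/filebeat.yml. A minimal sketch pointing at our Logstash instance (substitute my-elk-stack-vps.com with your ELK server’s domain):

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/*.log

output.logstash:
  hosts: ["my-elk-stack-vps.com:5044"]
  ssl.certificate_authorities: ["/etc/elk-certs/elk-ssl.crt"]
  ssl.certificate: "/etc/elk-certs/elk-ssl.crt"
  ssl.key: "/etc/elk-certs/elk-ssl.key"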

This config tells Filebeat where to send our logs and which SSL certificates to use for authentication. The paths option points to the default NGINX logs folder.
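Once the config is saved, enable and start Filebeat:

sudo systemctl enable filebeat
sudo systemctl start filebeat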

If you have had NGINX running for a while, you probably have a bunch of gzipped logs in /var/log/nginx/. To send them to Kibana, you should unzip them using gunzip and rename the resulting files to match the *.log wildcard expression, as shown below.
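A sketch (the rename step is needed because gunzip produces names like access.log.2, which the wildcard won’t pick up):

cd /var/log/nginx
sudo gunzip access.log.*.gz
for f in access.log.[0-9]*; do sudo mv "$f" "$f.log"; done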

Raw logs are here

If everything went fine, you can go to the Kibana dashboard and create an index pattern called weblogs-*. You can do it in the Management menu tab. Now you can go to Discover and see your raw log data there:

Raw NGINX access logs in Kibana dashboard

This is what a raw JSON entry stored in Elasticsearch looks like for a single NGINX log event after it has been parsed by Logstash:
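An illustrative example only; the field names come from the COMBINEDAPACHELOG grok pattern and the geoip filter used above, and all values are made up:

{
  "_index": "weblogs-2018.06.01",
  "_type": "doc",
  "_source": {
    "@timestamp": "2018-06-01T10:15:30.000Z",
    "clientip": "93.184.216.34",
    "verb": "GET",
    "request": "/blog/example-post",
    "response": "200",
    "bytes": "5120",
    "agent": "\"Mozilla/5.0 (X11; Linux x86_64)\"",
    "geoip": {
      "country_name": "United States",
      "city_name": "Los Angeles",
      "location": { "lat": 34.05, "lon": -118.24 }
    }
  }
}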

Troubleshooting

As you can see, you need to make various components play together in order to get the ELK stack running. Here’s a list of commands which can help you debug when things go wrong:

Filebeat logs:

tail -f /var/log/filebeat/filebeat

Logstash logs:

tail -f /var/log/logstash/logstash-plain.log

Start a Filebeat process in the foreground to see if it can connect to Logstash on the host ELK server:

/usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml -e -v

Start a Logstash process in the foreground to check why it’s not listening on a port:

/usr/share/logstash/bin/logstash --debug

Summary

I am just getting started with the ELK Elastic stack and discovering the options it has to offer. I hope that this tutorial will help you get up and running with it quickly, even if you don’t have much DevOps experience up your sleeve.