Elasticsearch and Kibana on AWS

Today I am going to show you how to design and implement a cloud-based, real-time log storage and analytics system. Why? Because a good, scalable log analytics system can identify bugs and latency problems on your platform, and it can also provide a foundation for personalization features in your apps. The architecture is also designed with multiple uses in mind, so you can run other search services on top of the same general system. For example, these are a few questions you will be able to answer with a real-time log analytics framework:

How many / which requests generated a 404 error?

How many / which requests generated a 500 error?

How many / which requests took more than 2 seconds to complete?

Which requests sent the most bytes back to the users?

What requests does the average user make when on the system? Now we can replicate those requests in a script and test with 10K users, 50K users, 4M users, etc. and see how the system responds.

Recommendations: users who made these requests also made these other requests (this can be anything, like viewing a book, favoriting a recipe, bookmarking an equation, etc.)

Below is a diagram of a production system that I built and have been running.

Elasticsearch

We use Elasticsearch as our distributed search service and Kibana as our user interface to Elasticsearch, both for ad hoc analysis of the logs and for saved analyses in the form of dashboards.
The first thing you will want to do is set up your Elasticsearch service. We use Amazon's Elasticsearch Service to provision and manage our resources. Go ahead and create a cluster; this will also install the Kibana plugin on your cluster. After your cluster is up and running you will have something called a search domain. Your search domain has three very important pieces of information:

Endpoint: where you make HTTP requests to the search cluster, e.g. search-yourSearchDomainName-somerandomhash.us-west-2.es.amazonaws.com

Kibana URL: where you point your browser for Kibana, e.g. search-yourSearchDomainName-somerandomhash.us-west-2.es.amazonaws.com/_plugin/kibana

ARN: the ID of your search domain, for use with other services like IAM, e.g. arn:aws:es:us-west-2:(YourAWSId):domain/yourSearchDomainName

When your search cluster is first set up, modify your access policy and make it wide open so you can test your configuration. You cannot just hit the Elasticsearch endpoint with curl from a machine in your VPC and have it work (the way you can with RDS). If the search cluster is not wide open you will have to either sign your HTTP requests (which I will show you how to do) or proxy through an authorized IP address (which I will also show you how to do). But first, let's just test the basics, so open it up.
Once you have opened up access to everyone, go ahead and use curl to test your basic Elasticsearch setup, and click on the Kibana link to make sure it is working. Play around with both adding some test entries into your cluster and querying them with Kibana.
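For example, with the cluster wide open you can index and query a test document with curl (the index name "logs" and the document below are just placeholders):

# Check the cluster is reachable
curl https://search-yourSearchDomainName-somerandomhash.us-west-2.es.amazonaws.com

# Index a test document into a hypothetical "logs" index
curl -XPOST https://search-yourSearchDomainName-somerandomhash.us-west-2.es.amazonaws.com/logs/entry -d '{ "user": "test", "message": "hello elasticsearch" }'

# Search for it
curl 'https://search-yourSearchDomainName-somerandomhash.us-west-2.es.amazonaws.com/logs/_search?q=hello'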

Nginx Reverse Proxy

The next step is to set up a reverse HTTP proxy so we can give secure web access to our protected Kibana resource via IAM. First let's create the reverse proxy and test it, then I'll show you how to secure it via IAM.
We will use Nginx as the reverse proxy.
You will need a running EC2 instance using any flavour of Linux you prefer; I use Amazon Linux. The details of installation and the exact location of the config files will depend on the Linux distribution you choose, but nginx installation and its config file locations are straightforward on all of them.

Once you have nginx installed, open up port 443 on your EC2 instance using the AWS console, and make sure you have SSL certificates. You could do this over port 80 without SSL certs, but I highly discourage that approach for a production system.
You need two nginx config files: nginx.conf and kibana.conf. Mine are located in /etc/nginx.

Note that kibana.conf references your server name, "myproxy.example.com". That is the address you will use to access Kibana. Here you can use an Elastic IP address and Route 53 to give a meaningful name to your instance. You will need this IP address to restrict access via IAM in the step below. Also, if your instance fails you can just spin up another EC2 instance and re-assign the Elastic IP without much trouble. Finally, notice that the last two lines of kibana.conf enable basic authentication; this protects your proxy with a username and password, which you create with the htpasswd tool.
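The original config files are not reproduced here, but a minimal kibana.conf along the lines described above might look like the following sketch (the server name, certificate paths, and search endpoint are placeholders):

server {
    listen 443 ssl;
    server_name myproxy.example.com;

    # Placeholder certificate paths
    ssl_certificate     /etc/nginx/ssl/myproxy.example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/myproxy.example.com.key;

    location / {
        # Forward everything to the Elasticsearch endpoint (Kibana lives at /_plugin/kibana)
        proxy_pass https://search-yourSearchDomainName-somerandomhash.us-west-2.es.amazonaws.com;
    }

    # The last two lines enable basic authentication
    auth_basic           "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
}

To create the username and password file referenced above (on Amazon Linux the htpasswd tool ships with the httpd-tools package):

htpasswd -c /etc/nginx/.htpasswd kibanauser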

You should now have a secure proxy! It is accessible only via HTTPS and is secured via Basic Auth. Next we need to restrict access to our search cluster and Kibana.

Identity and Access Management (IAM)

We use IAM to restrict access to our search cluster to signed requests and specific IP addresses, so we create two IAM policy statements in the declaration below. We have the IP restriction in place so that only our reverse proxy can interface with Kibana and the cluster. Then we have a user policy so that we can make HTTP calls to our cluster from code if we sign our requests appropriately (I will show you how to sign requests in Java).
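The exact declaration will vary, but a sketch of an access policy with those two statements looks like this (the user name, account ID, and source IP are placeholders; the source IP should be your proxy's Elastic IP):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::YourAWSId:user/yourEsUser" },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-west-2:YourAWSId:domain/yourSearchDomainName/*"
    },
    {
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-west-2:YourAWSId:domain/yourSearchDomainName/*",
      "Condition": {
        "IpAddress": { "aws:SourceIp": [ "203.0.113.10" ] }
      }
    }
  ]
}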
Attach this IAM policy to your search cluster by using the “Modify access policy” action on the AWS console. It takes a while for the policy to update. Once it is updated, make sure your Kibana plugin is still accessible via your reverse proxy. Also, you will no longer be able to make a curl HTTP call to your search cluster from anywhere.

To make HTTP calls to your search cluster now, you will need to sign your requests with the credentials of the IAM user you used above. Below is an example of how to do it in Java. I have five Java files below: two are from the AWS Java SDK and the rest are convenience classes to sign requests and access the search cluster. Feel free to use them as you see fit; the code I wrote is under an Apache License for com.sakkaris (I will have a Maven repo one of these days). Just add all these files into a package and add the "package" statement to the top of the code. For example, I have all these files in a "utils/aws" package. The files are:

AmazonOptions: a class I wrote to build HTTP options for the requests

AmazonSignedRequestClient: a class I wrote to make signed HTTP calls to AWS

AmazonElasticsearchRequestClient: a class I wrote to make AWS Elasticsearch calls

HttpRequestFactory: copy this file from the AWS Java SDK into your package because it is not a public class in the SDK. Apache License.

RepeatableInputStreamRequestEntity: copy this file from the AWS Java SDK into your package because it is not a public class in the SDK. Apache License.

RepeatableInputStreamRequestEntity.java

import com.amazonaws.Request;
import com.amazonaws.http.AmazonHttpClient;
import com.amazonaws.metrics.MetricInputStreamEntity;
import com.amazonaws.metrics.ServiceMetricType;
import com.amazonaws.metrics.ThroughputMetricType;
import com.amazonaws.metrics.internal.ServiceMetricTypeGuesser;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.http.entity.BasicHttpEntity;
import org.apache.http.entity.InputStreamEntity;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
 * Custom implementation of {@link RequestEntity} that delegates to an
 * {@link InputStreamRequestEntity}, with the one notable difference, that if
 * the underlying InputStream supports being reset, this RequestEntity will
 * report that it is repeatable and will reset the stream on all subsequent
 * attempts to write out the request.
 */
class RepeatableInputStreamRequestEntity extends BasicHttpEntity {

    /** True if the request entity hasn't been written out yet */
    private boolean firstAttempt = true;

    /** The underlying InputStreamEntity being delegated to */
    private InputStreamEntity inputStreamRequestEntity;

    /** The InputStream containing the content to write out */
    private InputStream content;

    /** Shared logger for more debugging information */
    private static final Log log = LogFactory.getLog(AmazonHttpClient.class);

    /**
     * Record the original exception if we do attempt a retry, so that if the
     * retry fails, we can report the original exception. Otherwise, we're most
     * likely masking the real exception with an error about not being able to
     * reset far enough back in the input stream.
     */
    private IOException originalException;

    /**
     * Creates a new RepeatableInputStreamRequestEntity using the information
     * from the specified request. If the input stream containing the request's
     * contents is repeatable, then this RequestEntity will report as being
     * repeatable.
     *
     * @param request
     *            The details of the request being written out (content type,
     *            content length, and content).
     */
    RepeatableInputStreamRequestEntity(final Request<?> request) {
        setChunked(false);

        /*
         * If we don't specify a content length when we instantiate our
         * InputStreamRequestEntity, then HttpClient will attempt to
         * buffer the entire stream contents into memory to determine
         * the content length.
         *
         * TODO: It'd be nice to have easier access to content length and
         *       content type from the request, instead of having to look
         *       directly into the headers.
         */
        long contentLength = -1;
        try {
            String contentLengthString = request.getHeaders().get("Content-Length");
            if (contentLengthString != null) {
                contentLength = Long.parseLong(contentLengthString);
            }
        } catch (NumberFormatException nfe) {
            log.warn("Unable to parse content length from request. " +
                     "Buffering contents in memory.");
        }

        String contentType = request.getHeaders().get("Content-Type");
        ThroughputMetricType type = ServiceMetricTypeGuesser
                .guessThroughputMetricType(request,
                        ServiceMetricType.UPLOAD_THROUGHPUT_NAME_SUFFIX,
                        ServiceMetricType.UPLOAD_BYTE_COUNT_NAME_SUFFIX);
        if (type == null) {
            inputStreamRequestEntity =
                    new InputStreamEntity(request.getContent(), contentLength);
        } else {
            inputStreamRequestEntity =
                    new MetricInputStreamEntity(type, request.getContent(), contentLength);
        }
        inputStreamRequestEntity.setContentType(contentType);
        content = request.getContent();

        setContent(content);
        setContentType(contentType);
        setContentLength(contentLength);
    }

    @Override
    public boolean isChunked() {
        return false;
    }

    /**
     * Returns true if the underlying InputStream supports marking/resetting or
     * if the underlying InputStreamRequestEntity is repeatable (i.e. its
     * content length has been set to
     * {@link InputStreamRequestEntity#CONTENT_LENGTH_AUTO} and therefore its
     * entire contents will be buffered in memory and can be repeated).
     *
     * @see org.apache.commons.httpclient.methods.RequestEntity#isRepeatable()
     */
    @Override
    public boolean isRepeatable() {
        return content.markSupported() || inputStreamRequestEntity.isRepeatable();
    }

    /**
     * Resets the underlying InputStream if this isn't the first attempt to
     * write out the request, otherwise simply delegates to
     * InputStreamRequestEntity to write out the data.
     * <p>
     * If an error is encountered the first time we try to write the request
     * entity, we remember the original exception, and report that as the root
     * cause if we continue to encounter errors, rather than masking the
     * original error.
     *
     * @see org.apache.commons.httpclient.methods.RequestEntity#writeRequest(OutputStream)
     */
    @Override
    public void writeTo(OutputStream output) throws IOException {
        try {
            if (!firstAttempt && isRepeatable()) content.reset();
            firstAttempt = false;
            inputStreamRequestEntity.writeTo(output);
        } catch (IOException ioe) {
            if (originalException == null) originalException = ioe;
            throw originalException;
        }
    }
}

Below is an example Java main method to make requests to your search cluster. Just replace the searchEndpoint below with your cluster's endpoint. You must have your AWS credentials configured on your machine for this to work; it is pretty straightforward, and the details can be found here: AWS Java Credentials
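The original listing is not reproduced here; as a rough sketch of the same idea, here is a main method that builds and signs a request with the SDK's AWS4Signer using the credentials configured on your machine. The index path /logs/entry and the document body are hypothetical, and the last step (actually executing the signed request) is where the convenience classes above come in.

import com.amazonaws.DefaultRequest;
import com.amazonaws.Request;
import com.amazonaws.auth.AWS4Signer;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.http.HttpMethodName;

import java.io.ByteArrayInputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class SignedRequestExample {

    public static void main(String[] args) {
        // Replace with your own search domain endpoint.
        String searchEndpoint =
                "https://search-yourSearchDomainName-somerandomhash.us-west-2.es.amazonaws.com";
        String document = "{ \"user\": \"test\", \"message\": \"hello elasticsearch\" }";

        // Build a generic request against the "es" service.
        Request<Void> request = new DefaultRequest<>("es");
        request.setEndpoint(URI.create(searchEndpoint));
        request.setHttpMethod(HttpMethodName.POST);
        request.setResourcePath("/logs/entry");
        request.addHeader("Content-Type", "application/json");
        request.setContent(new ByteArrayInputStream(document.getBytes(StandardCharsets.UTF_8)));

        // Sign the request (SigV4) with the credentials configured on this machine.
        AWS4Signer signer = new AWS4Signer();
        signer.setServiceName("es");
        signer.setRegionName("us-west-2");
        signer.sign(request, new DefaultAWSCredentialsProviderChain().getCredentials());

        // The request now carries the Authorization header; hand it off to an HTTP
        // client to execute (this is what the convenience classes above wrap up).
        System.out.println(request.getHeaders());
    }
}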

Load Balancer / S3 / Glacier

OK, so the hardest part is over! You also have a couple of nifty little Java tools to interact with your cloud platform. In this next part we will deal with publishing the access logs from your app server fleet to S3 / Glacier. Luckily this is very easily done with the Amazon console.

We use a load balancer to distribute web requests made to our platform across a cluster of app servers running our code. The load balancer records the exact HTTP request and which server the request was routed to, and saves that as a one-line entry in the access log. Multiple access log files are generated for multiple app servers, and all the log files get pushed to S3. We use S3 to temporarily store our access log files and to generate events when access log files have been created (how temporary is configurable; it could be 1 day or 1 year, whatever you like). After we are done processing our logs we back them up to Glacier for cheaper archival storage.

Lambda

AWS Lambda is the last piece of the puzzle. We use Lambda to automatically launch computational resources to handle our S3 events and process the access log files. I will give you example code that you are free to use to process S3 events and index them into Elasticsearch. Go ahead and set up your Lambda functions through the AWS console. Notice that you need to upload a zip or jar file. The best way to do it is to package your Lambda code with any dependencies as a jar, upload it to S3, and then point to the S3 URL through the console. One important caveat is that any dependencies have to be flattened out in your jar. Luckily there is a Maven plugin that can accomplish this.
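One common way to build such a flattened ("fat") jar, and my assumption about the plugin referred to above, is the maven-shade-plugin. A minimal build section might look like this:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>

The original example code for processing the S3 events is not reproduced here either, but the following is a minimal sketch of such a handler. It streams each newly created access log file line by line; the indexing step is left as a comment because it depends on the signed request helpers above, and the class name AccessLogHandler is just a placeholder.

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.event.S3EventNotification.S3EventNotificationRecord;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

/** Reacts to S3 "object created" events and processes each new access log file. */
public class AccessLogHandler implements RequestHandler<S3Event, String> {

    private final AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

    @Override
    public String handleRequest(S3Event event, Context context) {
        for (S3EventNotificationRecord record : event.getRecords()) {
            String bucket = record.getS3().getBucket().getName();
            String key = record.getS3().getObject().getKey();

            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(s3.getObject(bucket, key).getObjectContent()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Parse the access log line and index it into Elasticsearch
                    // with a signed request (see the helper classes above).
                }
            } catch (IOException e) {
                context.getLogger().log("Failed to process " + key + ": " + e.getMessage());
            }
        }
        return "done";
    }
}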

Dependencies: you need the example code above along with the following Maven dependencies.
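The exact dependency list depends on your SDK versions; as a sketch, a pom.xml for the Lambda above would pull in something like the following (version numbers are examples only; pin whatever versions you actually use):

<dependencies>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-lambda-java-core</artifactId>
    <version>1.1.0</version>
  </dependency>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-lambda-java-events</artifactId>
    <version>1.3.0</version>
  </dependency>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-s3</artifactId>
    <version>1.11.76</version>
  </dependency>
  <dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.2</version>
  </dependency>
  <dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.8.5</version>
  </dependency>
</dependencies>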

I wrote some helper code to parse a log line into a Java object and then marshal it to JSON for indexing. Feel free to use it; you can modify it to use your own request parameters, but it will work out of the box for Elastic Load Balancer access logs.
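That helper is not reproduced here; the following is a minimal sketch of the same idea (the class and field names are mine, not the original). It splits a classic ELB access log line into its space-delimited fields, keeping the quoted request intact, and uses Jackson to marshal the resulting object to JSON for indexing.

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One parsed line of a classic ELB access log, ready to be indexed as JSON. */
public class AccessLogEntry {

    // Matches either a double-quoted field (e.g. the request) or a run of non-space characters.
    private static final Pattern FIELD = Pattern.compile("\"([^\"]*)\"|(\\S+)");
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public String timestamp;
    public String elbName;
    public String clientAddress;
    public String backendAddress;
    public double requestProcessingTime;
    public double backendProcessingTime;
    public double responseProcessingTime;
    public int elbStatusCode;
    public long sentBytes;
    public String request;

    public static AccessLogEntry parse(String line) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            fields.add(m.group(1) != null ? m.group(1) : m.group(2));
        }

        // Field positions follow the classic ELB access log format.
        AccessLogEntry entry = new AccessLogEntry();
        entry.timestamp = fields.get(0);
        entry.elbName = fields.get(1);
        entry.clientAddress = fields.get(2);
        entry.backendAddress = fields.get(3);
        entry.requestProcessingTime = Double.parseDouble(fields.get(4));
        entry.backendProcessingTime = Double.parseDouble(fields.get(5));
        entry.responseProcessingTime = Double.parseDouble(fields.get(6));
        entry.elbStatusCode = Integer.parseInt(fields.get(7));
        entry.sentBytes = Long.parseLong(fields.get(10));
        entry.request = fields.get(11);   // e.g. "GET https://example.com:443/path HTTP/1.1"
        return entry;
    }

    public String toJson() throws JsonProcessingException {
        return MAPPER.writeValueAsString(this);
    }
}

In the Lambda handler sketched above you would call AccessLogEntry.parse(line).toJson() for each line and POST the result to your search cluster with a signed request.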

You are ready to go! Now run "mvn package", upload your jar to S3, and configure your Lambda to use the jar. You still have a couple of small details to handle, such as configuring your Lambda to listen to the correct S3 bucket, but you can do that through the console and the Lambda documentation walks you through it.

Below is an example of what you will see in Kibana. This shows us all requests that took more than 2.1 seconds to complete this week. Have fun!