December 22, 2011

Before Amazon introduced the Elastic Load Balancing (ELB) service, the only way to do load balancing in EC2 was to use one of the software-based solutions such as HAProxy or Pound.

Having just one EC2 instance running a software-based load balancer would obviously be a single point of failure, so a popular technique was to do DNS Round-Robin and have the domain name corresponding to your Web site point to several IP addresses via separate A records. Each IP address would be an Elastic IP associated with an EC2 instance running the load balancer software. This was still not perfect, because if one of these instances went down, users pointed to that instance via DNS Round-Robin would still get an error until a replacement instance was launched.
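To make the weakness of DNS Round-Robin concrete, here is a minimal sketch that estimates how many requests fail when one load balancer instance is down. It assumes clients are handed addresses in strict rotation and never retry; the function name and the IP addresses (taken from a documentation range) are hypothetical.

```python
from itertools import cycle


def round_robin_failure_rate(addresses, down, requests=1000):
    """Fraction of requests that land on a dead address.

    Assumes strict rotation over `addresses` and no client-side retry.
    """
    rotation = cycle(addresses)
    failures = sum(1 for _ in range(requests) if next(rotation) in down)
    return failures / requests


# Three hypothetical Elastic IPs, one of which has lost its instance.
balancers = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]
rate = round_robin_failure_rate(balancers, down={"203.0.113.11"}, requests=900)
print(f"{rate:.0%} of requests fail")
```

With one of three instances down, roughly a third of the requests hit the dead address until a replacement takes over the Elastic IP.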

Another issue that comes up all the time in the context of load balancing is SSL termination. Ideally you would like the load balancer to act as an SSL end-point, in order to offload the SSL computations from your Web servers, and also for easier management of the SSL certificates. HAProxy does not support SSL termination, but Pound does. (Note that you can still pass SSL traffic through HAProxy by using its TCP mode; you just cannot terminate SSL traffic there.)

In short, if Elastic Load Balancing weren’t available, you could still cobble together a load balancing solution in EC2. However, there is no reason to ‘roll your own’ anymore now that you can use the ELB service. Note that HAProxy is still the king of load balancers when it comes to the different algorithms you can use (and to a myriad of other features), so if you want the best of both worlds, you can have an ELB up front, pointing to one or more EC2 instances running HAProxy, which in turn distribute traffic to your Web server farm.

Elastic Load Balancing and the DNS Root Domain

One other issue that comes up all the time is that an ELB is only available as a CNAME. This is because Amazon needs to scale the ELB service in the background depending on the traffic that hits it, so they cannot simply provide a fixed IP address. A CNAME is fine if you want to load balance traffic to www.yourdomain.com, since that name can be mapped to a CNAME. However, the root or apex of your DNS zone, yourdomain.com, cannot be a CNAME, because a CNAME cannot coexist with the SOA and NS records that the apex must carry; it has to be mapped to an A record instead. In theory, then, you could not use an ELB for yourdomain.com. In practice, however, there are DNS providers that allow you to specify an alias for your root domain (I know Dynect does this, as does Amazon’s own Route 53 DNS service).
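The apex restriction comes from the DNS rule that a name holding a CNAME may not hold any other record type. As a rough illustration, the sketch below scans a list of records and reports names that break that rule. The function name, the record tuples and the ELB hostname are all hypothetical, and the check ignores DNSSEC-related record types for simplicity.

```python
def cname_conflicts(records):
    """Return the names that hold a CNAME alongside other record types.

    `records` is a list of (name, record_type, value) tuples.
    """
    types_by_name = {}
    for name, record_type, _value in records:
        types_by_name.setdefault(name, set()).add(record_type)
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


# Hypothetical ELB hostname; real ones are assigned by Amazon.
elb = "my-elb-1234567890.us-east-1.elb.amazonaws.com."
zone = [
    ("yourdomain.com.", "SOA", "ns1.example.net. hostmaster.yourdomain.com."),
    ("yourdomain.com.", "NS", "ns1.example.net."),
    ("yourdomain.com.", "CNAME", elb),      # not allowed at the apex
    ("www.yourdomain.com.", "CNAME", elb),  # fine: no other records here
]
print(cname_conflicts(zone))
```

Only the apex is reported, because it already carries SOA and NS records; www.yourdomain.com holds nothing but the CNAME and is accepted.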

Elastic Load Balancing and SSL

The AWS console makes it easy to associate an SSL certificate with an ELB instance at ELB creation time. You do need to add an SSL line to the HTTP protocol table when you create the ELB. Note that even though you terminate the SSL traffic at the ELB, you can choose between unencrypted HTTP traffic and encrypted SSL traffic between the ELB and the Web servers behind it. If you want to offload the SSL processing from your Web servers, choose HTTP between the ELB and the Web server instances.

If, however, you want to associate an existing ELB instance with a different SSL certificate (say you initially associated it with a self-signed SSL cert, and now you want to use a real SSL cert), you can’t do that with the AWS console anymore. You need to use command-line tools. Here’s how.

Before you install the command-line tools, a caveat: you need Java 1.6. If you use Java 1.5 you will most likely get errors such as java.lang.NoClassDefFoundError when trying to run the tools.

That's it! At this point, the SSL certificate for stage.mysite.com will be associated with the ELB instance handling HTTP and SSL traffic for stage.mysite.com. Not rocket science, but not trivial to put together all these bits of information either.

4 comments:

If you put an ELB in front of your load balancer instance you can use AutoScaling to mitigate the SPOF.

AutoScaling groups can take advantage of the ELB health checks. Doing this will allow AutoScaling to replace your load balancer instance if it fails to serve requests.

Make an AS group of 1 instance and set it to use ELB health checks. If your load balancer stops working for any reason AS will terminate it and replace it.

For this to work you need to either have an AMI for your load balancer or have user-data you can pass in via the AS launch configuration which will cause the new instance to configure itself as a load balancer. Puppet, Chef, etc. are all good solutions for this. Also, set the ELB health check type to HTTP and turn the intervals down.
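As a rough sketch of the group-of-one setup described above, the function below builds the settings for such an AutoScaling group. The key names follow the parameters of the AutoScaling CreateAutoScalingGroup API as I understand them; the function name, the resource names and the grace period are hypothetical, and the sketch only builds the settings rather than calling AWS.

```python
def single_instance_group_params(group_name, launch_config, elb_name,
                                 zones, grace_seconds=300):
    """Settings for an AutoScaling group pinned to exactly one instance
    whose health is judged by the ELB health check."""
    return {
        "AutoScalingGroupName": group_name,
        "LaunchConfigurationName": launch_config,
        # Min = max = desired = 1: a failed instance is replaced, never scaled.
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        "AvailabilityZones": list(zones),
        "LoadBalancerNames": [elb_name],
        # Use the ELB health check instead of the default EC2 status check.
        "HealthCheckType": "ELB",
        # Time allowed for a new instance to boot and configure itself.
        "HealthCheckGracePeriod": grace_seconds,
    }


# All names below are made up for illustration.
params = single_instance_group_params(
    "lb-group", "lb-launch-config", "front-elb", ["us-east-1a"]
)
print(params["HealthCheckType"], params["MinSize"], params["MaxSize"])
```

Assuming the boto3 library is available, a dictionary like this could be passed as keyword arguments to the AutoScaling client's create_auto_scaling_group call. The grace period matters here: if it is shorter than the time the instance needs to configure itself from user-data, AutoScaling may terminate it before it ever becomes healthy.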

Another point is that when you terminate SSL at the ELB you'll have two additional HTTP headers added to the request. One is the expected X-Forwarded-For, which will have the remote client's IP (your software load balancer should also append the ELB IP to this list). The other is X-Forwarded-Proto, which your web servers can use to detect whether the client connection is secure. (As a side note, Tomcat has a RemoteIP valve which consumes these, and a similar remoteip module is coming in a future version of Apache httpd.)
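For web applications that read these headers themselves, a minimal sketch might look like the following. The function name and the sample addresses are hypothetical, and the code assumes every request really arrives through the ELB: a client that can reach the web server directly could forge both headers.

```python
def client_info(headers):
    """Return (client_ip, is_secure) from the ELB-added request headers.

    `headers` is a plain dict of header names to values.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # X-Forwarded-For is a comma-separated list; the leftmost entry is
    # taken as the original client, later entries are intermediate proxies.
    hops = [hop.strip()
            for hop in lowered.get("x-forwarded-for", "").split(",")
            if hop.strip()]
    client_ip = hops[0] if hops else None

    # X-Forwarded-Proto carries the scheme the client used to reach the ELB.
    is_secure = lowered.get("x-forwarded-proto", "http").lower() == "https"
    return client_ip, is_secure


request_headers = {
    "X-Forwarded-For": "198.51.100.7, 10.0.0.5",
    "X-Forwarded-Proto": "https",
}
print(client_info(request_headers))
```

A typical use is to redirect to HTTPS when is_secure is false, since the web server itself only ever sees plain HTTP from the ELB.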

Finally, one thing ELB does not do is cache static content, and one software load balancer which does that didn't get mentioned in your article: Varnish Cache. I've had great success putting an ELB in front of Varnish. The ELB only serves two purposes: SSL and AS health checks. There's only one instance behind the ELB, which is my Varnish server. Varnish then balances between my web app servers and also caches static content. This combo is so full of win I highly recommend it.

I'm not suggesting replacing an ELB! I'm just taking the idea in the article and adding AutoScaling to the mix. I'm suggesting combining a software load balancer and an ELB with an AutoScaling group.

In fact, the strategy could work with any software load balancer. HAProxy, Pound, Varnish, it doesn't matter. I just mentioned Varnish as one more example of many.

The ELB sits at the very front, and behind it is a single EC2 instance running a software load balancer, just like in the original post, but the EC2 instance is managed by AutoScaling (in a group of 1).

Without AutoScaling, as was described in the article, the software load balancer instance becomes a SPOF (single point of failure).

AutoScaling + ELB mitigates that risk, because if the software load balancer stops serving requests for any reason, the ELB will detect that and trigger AutoScaling to destroy & replace it.