Inconsistent Amazon Elastic Load Balancing (ELB)

ELB on Amazon AWS has by far been one of the most frustrating of all the services that we have used on the Amazon AWS platform for the Unbxd Search cloud. Here is what we repeatedly see with ELB –

1. Does not balance the load equally over it’s children EC2 instances.

2. Sometimes it appears to do so and we believe that over time it will achieve equal distribution in the round-robin configuration. However, it simply does not happen.

3. Time and again, we have issues with search requests not responding quickly and turns out – Amazon AWS has screwed up by transferring load to only 1 single EC2 instance and the remaining 3 EC2 instances are starved (yes, 3 are starved!!!)

4. Documentation and other posts on the internet suggest that the load balancing happens on EC2 instances equally over the availability regions. We tried to benchmark that with the number of requests and it’s absolutely inconsistent.

5. There is nothing in the API also which we could use to customize such a behavior – adding weights is a ‘not-even-thinking-about-it’ feature!

Here’s how we have managed to get around this problem with a crude brute force solution – HAProxy on all instances. Nevertheless, it has actually *solved* our problem of achieving equal load distribution across EC2 instances.

1. Direct all ELB traffic to an HAProxy port on all EC2 instances. Now you can also secure such calls
2. HAProxy installed on all EC2 instances re-direct calls in a round robin way to all other EC2 instances over the application port. Even if ELB screws up, we make up for it because of HAProxy
3. Remember that all our search request calls are non-sticky calls