SDN Prerequisite: Why Network Load Balancing is not the Same as Application Load Balancing

Way back in the early days of the Internet, scalability was an issue (the more things change...). One of the answers to this problem was to scale out web servers using a fairly well-proven concept called load balancing. Simply put, distribute the load across web servers to make sure everyone gets served in a timely fashion. We see this in action at stores every day when more checkout lines are added as demand increases. Well, we hope we see this in action. Too often we don't, much to our chagrin.

Anyway, the way in which early load balancing worked was simply to take a couple of variables (IP address and TCP port), hash them together, and stick the result in the equivalent of a queue for a web server. Because hash values tend to distribute fairly evenly, this worked well (until we ran into the mega-proxy issue, where huge numbers of users showed up behind a handful of proxy addresses, thanks to folks like CompuServe and AOL).

This is called "network load balancing" because, well, it uses network variables to distribute load. It's quite fast, actually, because it's based on variables that are in fixed locations within a single packet: source or destination IP address and TCP port. All the work is on the ingress, on the inbound side, and once the decision has been made it's a pretty simple thing to hash future packets and match them up before sending them on their way. Voila. Network load balancing.
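To make the idea concrete, here is a minimal sketch of hash-based network load balancing in Python. It is not how any particular product did it; the server pool, the function name pick_server, and the choice of SHA-256 as the hash are all assumptions made for illustration. The only inputs are the network variables described above.

```python
import hashlib

# Hypothetical pool of web servers (addresses are illustrative only).
SERVERS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

def pick_server(src_ip: str, src_port: int, servers=SERVERS) -> str:
    """Map a client's IP address and TCP port to one server in the pool."""
    # Only header fields are used; nothing about the servers' state is known.
    key = f"{src_ip}:{src_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Reduce the hash to an index into the pool.
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]

# Every packet carrying the same IP/port pair hashes to the same server,
# so later packets can be matched without consulting the servers at all.
first = pick_server("192.0.2.10", 51000)
again = pick_server("192.0.2.10", 51000)
print(first == again)
```

Notice what is missing: nothing in the function asks whether the chosen server is up, busy, or slow.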

Application load balancing, however, arose because network load balancing was all based on inbound variables. It couldn't take into consideration how loaded the chosen server was, or whether its response time was falling within acceptable business parameters, or whether it was at capacity or not. Those variables were all on the server side, and required visibility into the application, not the client.

It also couldn't account for the fact that virtual servers were popping up everywhere (multiple applications served from the same IP address and port), which forced the web server to become a load balancer itself. Which, if you think about it, was kind of crazy. If a single server couldn't scale well enough to meet demand, how was putting a single server in front of a group of them going to help the situation?

Application load balancing (which has also been given other fancy names over the years like content switching or routing, application switching, application or page routing, etc...) is really focused on distributing load across applications intelligently. While it can use ingress variables like IP address and port, it generally doesn't because those don't offer insight into which server (application, web, virtual, whatever) is going to be able to respond (has capacity) in a time frame acceptable to the business (response time) for a specific application (or piece of the application like images).
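A rough sketch of what that kind of decision might look like follows, again in Python. The Backend fields, the 500 ms response-time threshold, and the "least loaded, then fastest" tie-break are assumptions chosen for illustration, not a description of any specific load balancer's algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Backend:
    # Hypothetical server-side variables a load balancer might track.
    name: str
    healthy: bool
    active_connections: int
    max_connections: int
    avg_response_ms: float

def pick_backend(backends: List[Backend],
                 max_response_ms: float = 500.0) -> Optional[Backend]:
    """Choose a backend using application-side variables, not client ones."""
    candidates = [
        b for b in backends
        if b.healthy                                   # application status
        and b.active_connections < b.max_connections   # has capacity
        and b.avg_response_ms <= max_response_ms       # acceptable response time
    ]
    if not candidates:
        return None
    # Prefer the least loaded backend; break ties on response time.
    return min(candidates,
               key=lambda b: (b.active_connections / b.max_connections,
                              b.avg_response_ms))

pool = [
    Backend("web1", True, 90, 100, 120.0),   # nearly full
    Backend("web2", True, 20, 100, 80.0),    # lightly loaded
    Backend("web3", False, 0, 100, 10.0),    # down
    Backend("web4", True, 5, 100, 900.0),    # too slow
]
print(pick_backend(pool).name)  # web2
```

None of these inputs come from the client's packet; they all have to be gathered from the servers or observed in the responses.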

The difference between the two lies primarily in the variables used to distribute load. Network load balancing relies solely on network variables, while application load balancing relies mainly on application variables.

This change in load balancing techniques opened up all sorts of new efficiencies and scalability options because it allowed architectures to specialize (route requests for images to servers focused on serving images, requests for static content to servers focused on serving static content, etc...). It also enabled persistence (sticky sessions), which greatly accelerated the ability to scale out stateful applications in a web format.
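Both ideas, specialization and persistence, can be sketched in a few lines of Python. Everything named here (the pools, the path rules, the Router class, the session identifier) is hypothetical and only meant to show the shape of the technique. The routing decision depends on the request itself, which a purely network-based hash never sees.

```python
# Hypothetical pools of specialized servers.
POOLS = {
    "images": ["img1", "img2"],
    "static": ["static1"],
    "app": ["app1", "app2", "app3"],
}

def choose_pool(path: str) -> str:
    """Pick a pool from the request path (an application variable)."""
    lowered = path.lower()
    if lowered.startswith("/images/") or lowered.endswith((".png", ".jpg", ".gif")):
        return "images"
    if lowered.startswith("/static/") or lowered.endswith((".css", ".js")):
        return "static"
    return "app"

class Router:
    def __init__(self, pools):
        self.pools = pools
        self.sticky = {}                       # session id -> app server
        self.counters = {name: 0 for name in pools}

    def route(self, path, session_id=None):
        pool = choose_pool(path)
        # Persistence: a known session returns to the server holding its state.
        if pool == "app" and session_id in self.sticky:
            return self.sticky[session_id]
        servers = self.pools[pool]
        server = servers[self.counters[pool] % len(servers)]  # round robin
        self.counters[pool] += 1
        if pool == "app" and session_id is not None:
            self.sticky[session_id] = server
        return server

router = Router(POOLS)
print(router.route("/images/logo.png"))           # img1
print(router.route("/cart", session_id="abc"))    # app1
print(router.route("/cart", session_id="xyz"))    # app2
print(router.route("/checkout", session_id="abc"))  # app1 again
```

The sticky table is state that has to live somewhere in the data path, which is part of why this is a different animal from a stateless hash.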

Why Is It Important to SDN?

This is important to SDN architectures because layer 3 switches can, in fact, support network load balancing. Fairly easily, too. If you look at how Link Aggregation (trunking) is implemented in most switches, you'll see it's using network load balancing techniques to distribute load across trunked links, and that the algorithms used are pretty much the same ones we used back in the day to load balance servers based on network variables. The hash is pretty simple (and easily implemented) and doesn't require storing state because it is always based on the same variables, which are easily extracted from IP and TCP headers and don't really tax the system. Forwarding tables are basically sets of inbound IP addresses, TCP ports and (switch) ports matched to outbound IP addresses, TCP ports and (switch) ports. So you can see that network load balancing wouldn't overly tax a controller (it just has to hash the right values and insert a forwarding entry) or a switch.
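As a rough illustration of how little work that is, here is a Python sketch of a controller hashing header fields and producing a forwarding entry. The dictionary layout, the function names, and the use of CRC32 are assumptions made for this example; this is not the OpenFlow API or any real controller's data model.

```python
import zlib

# Hypothetical outbound switch ports in a trunk (or servers behind them).
LINKS = ["port1", "port2", "port3", "port4"]

def pick_link(src_ip, dst_ip, src_port, dst_port, links=LINKS):
    """Stateless choice of outbound link from IP and TCP header fields."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return links[zlib.crc32(key) % len(links)]

def build_forwarding_entry(in_port, src_ip, dst_ip, src_port, dst_port,
                           links=LINKS):
    """Build an illustrative match/action entry for a forwarding table."""
    return {
        "match": {
            "in_port": in_port,
            "src_ip": src_ip,
            "dst_ip": dst_ip,
            "src_port": src_port,
            "dst_port": dst_port,
        },
        "action": {
            "out_port": pick_link(src_ip, dst_ip, src_port, dst_port, links),
        },
    }

entry = build_forwarding_entry("port9", "192.0.2.10", "198.51.100.5", 51000, 80)
print(entry["action"]["out_port"] in LINKS)  # True
```

The controller does one hash and one insert per flow, and after that the switch forwards matching packets on its own.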

But it wouldn't be application centric, or be able to take into consideration things that modern load balancing services care about - like application status, connection capacity, and response times, not to mention enabling specialization of services. To be application centric, application load balancing must participate in the data path and have visibility into variables that aren't available in packet headers - they're in payloads and in the application server (instances) itself. As with the implications of being stateful versus stateless, the burden on a centralized controller would be overwhelming.

Thus, while SDN principles are certainly applicable, the architecture used to implement SDN for lower order network layer services is not going to be the same architecture used to implement SDN for higher order network layer services. When evaluating SDN solutions, it's again important to consider how any two SDN network (core and application) architectures complement one another, integrate with one another, and collaborate to enable a complete software-defined network architecture that supports the unique needs of both layer 2-3 and layer 4-7.