Think you know what scalability is?

"Scalability" is a word often cited in software vendor press releases (and discussed around the water cooler), but it is quite often misunderstood. For example, many people really mean performance or high availability when they talk about scalability. Royans K Tharakan, in attempting to answer the question "What is scalability?", says:

Scalability, simply, is about doing what you do in a bigger way. Scaling a web application is all about allowing more people to use your application. If you can’t figure out how to improve performance while scaling out, it's okay. And as long as you can scale to handle a larger number of users, it's ok to have multiple single points of failure as well.

Royans notes that today we have two choices when it comes to scaling:

Vertical Scalability - Adding resource within the same logical unit to increase capacity. An example of this would be to add CPUs to an existing server, or expanding storage by adding hard drive on an existing RAID/SAN storage.

Horizontal Scalability - Adding multiple logical units of resources and making them work as a single unit. Most clustering solutions, distributed file systems, load-balancers help you with horizontal scalability.

Architects strive for linear scalability: throughput that grows in proportion to the resources added to the system. Adding resources incurs overhead, however, which makes linear scaling difficult to achieve. Royans captures that overhead in a "scalability factor" and uses it to enumerate types of scalability:

If the scalability factor stays constant as you scale, this is called linear scalability.

But chances are that some components may not scale as well as others. A scalability factor below 1.0 is called sub-linear scalability.

Though rare, it's possible to get better performance (scalability factor) just by adding more components (I/O across multiple disk spindles in a RAID gets better with more spindles). This is called supra-linear scalability.

If the application is not designed for scalability, it's possible that things can actually get worse as it scales. This is called negative scalability.
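The four categories above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the excerpt does not define how the scalability factor is computed, so the sketch takes it to be the fraction of one added unit's nominal throughput that actually shows up as extra throughput, and it treats a factor close to 1.0 as "linear". The function names, the tolerance, and the throughput numbers are all hypothetical.

```python
from math import isclose


def scalability_factor(prev_throughput, new_throughput, unit_throughput):
    """Share of one added unit's nominal throughput that is actually gained.

    Assumed definition (not taken from the article): extra throughput
    observed after adding one unit, divided by what a single unit
    delivers on its own.
    """
    return (new_throughput - prev_throughput) / unit_throughput


def classify(factor, tolerance=0.02):
    """Map a scalability factor onto the four categories in the text."""
    if factor < 0:
        return "negative"        # adding a unit made things worse
    if isclose(factor, 1.0, abs_tol=tolerance):
        return "linear"          # the full unit capacity was gained
    if factor < 1.0:
        return "sub-linear"      # some capacity was lost to overhead
    return "supra-linear"        # more than one unit's worth was gained


# Hypothetical measurements in requests/second; one unit alone does 100.
unit = 100.0
for before, after in [(100.0, 200.0), (100.0, 195.0), (100.0, 220.0), (100.0, 90.0)]:
    factor = scalability_factor(before, after, unit)
    print(f"{before:.0f} -> {after:.0f} req/s: factor {factor:.2f}, {classify(factor)}")
```

In practice the factor would be measured at several sizes, since a system that scales almost linearly from one to two units may well turn sub-linear or negative further out.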

As with a lot of things in software development, there is no one-size-fits-all prescription that will solve your scalability problems. Royans suggests "If you need scalability, urgently, going vertical is probably going to be the easiest" but warns "Unfortunately Vertical scaling, gets more and more expensive as you grow" and "While infinite horizontal linear scalability is difficult to achieve, infinite vertical scalability is impossible". He goes on to say:

Horizontal scalability, on the other hand, doesn’t require you to buy more and more expensive servers. It's meant to be scaled using commodity storage and server solutions. But Horizontal scalability isn’t cheap either. The application has to be built from the ground up to run on multiple servers as a single application.

Royans ends his piece with advice on tackling scalability across the stack:

For a successful scalable web application, all layers have to scale in equally. Which includes the storage layer (Clustered file systems, s3, etc), the database layer (partitioning, federation), application layer (memcached, scaleout, terracota, tomcat clustering, etc), the web layer, load balancer, firewall, etc. For example if you don’t have a way to implement multiple load balancers to handle your future web traffic load, it doesn’t really matter how much money and effort you put into horizontal scalability of the web layer. Your traffic will be limited to only what your load balancer can push.
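The load balancer example in the quote above comes down to a simple rule: when every request passes through each layer in turn, the layer with the smallest capacity caps the whole system. A minimal sketch of that rule follows; the layer names and requests-per-second figures are made up for illustration and do not come from the article.

```python
def end_to_end_capacity(layer_capacity):
    """Return (bottleneck_layer, capacity) for layers traversed in series.

    Every request goes through every layer, so the sustainable rate is
    the smallest per-layer capacity.
    """
    bottleneck = min(layer_capacity, key=layer_capacity.get)
    return bottleneck, layer_capacity[bottleneck]


# Hypothetical capacities in requests/second.
layers = {
    "load balancer": 5_000,
    "web": 20_000,
    "application": 12_000,
    "database": 8_000,
}
print(end_to_end_capacity(layers))

# Doubling the web layer changes nothing while the load balancer is the limit.
scaled_out = {**layers, "web": 40_000}
print(end_to_end_capacity(scaled_out))
```

Both calls report the load balancer at 5,000 requests/second, which is the point of the quote: money spent on the web layer is wasted until the limiting layer can scale too.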

This is a nice summary of the classic tension between vertical and horizontal scaling. It only addresses the technical aspect of scaling, however.

This covers scaling your capacity (the maximum throughput you can sustain for a given workload while maintaining acceptable response time). Royans does address the capital costs of "scaling up" versus "scaling out", but he leaves out the operational costs.

With a typical 3-tier architecture, scaling horizontally will produce super-linear increases in operating costs. As the number of boxes grows, so does the size of the operations group. Because those people must themselves be managed, there is additional overhead that produces worse-than-linear increases in operating costs.

Scaling vertically usually doesn't incur these operational costs, but it definitely incurs rapidly increasing capital costs as the boxes get large. (E.g., more than $1M USD for an E25K chassis.)

In a recent blog post, I offer a definition of scalability that attempts to account for all factors: capital costs, operational costs, and even the unseen costs of heavier processes and more extensive monitoring and management systems.

I think the classification that Royans describes can work together with my definition, as there can be several paths to profitable scalability. Royans describes and classifies those paths well.

I'd also add that scalability is only one of a number of core non-functional requirements, and the measure of its success is the ability to achieve it (by whichever means you choose) whilst keeping the others within some pre-agreed reasonable tolerances. Your funky new web site may well scale out to handle an increase in users after adding a bunch of new kit, but they won't be happy bunnies if response times degrade because you didn't also address a software bottleneck.

Good to see it being discussed though. Scalability is a neglected area in too many organisations.

1. A key component of scalability is to define the performance requirements of the system early in the design process. If it's a simple data entry application that only has 20 concurrent users, and that number is never expected to grow beyond 75, maybe designing for scalability shouldn't consume a ton of your design time up front. It obviously depends on the system, but designing for ultra-high scalability often doesn't add business value.

2. Simple vertical scalability is almost inherent in software design; most applications are going to perform better if a server has a faster CPU, more memory, or a faster I/O bus. The key issue is defining and obtaining pertinent metrics to measure the scalability. For example, how much of a performance boost do you get by doubling the RAM or increasing the CPU speed by 20%?

3. In my experience, most mid-sized 3-tier systems use a hybrid approach to achieve acceptable scalability. Horizontal scaling of the UI/presentation layer is often easy to accomplish, either through enterprise web development frameworks or because the application is residing on the client computers. And database servers in general respond well to vertical scaling, as long as the application architecture is reasonable. So, the primary scalability decision concerns the middle tier. Do you need to design the middle tier architecture to scale horizontally? Most often, that is THE decision that architects need to make when considering scalability.

Of course, I'm speaking from a mid-to-large business app background, but relatively simple business apps account for a huge percentage of custom development.

For a successful scalable web application, all layers have to scale in equally.

That is very true, and that is why a Tier Based approach is not scalable: it introduces lots of moving parts, and each tier is built as a "silo" from a high availability and scalability perspective.

Nati, I hope you don't mind if I disagree, since it's actually the quote from the original article that I'm fundamentally disagreeing with. (It's just that your additional comment is what finally got me to respond. ;-)

Solving scalability problems is often a process of working around the scalability limits of the parts of the infrastructure that do NOT scale equally. In other words, in the real world, one does not have the ability to choose all of the components, applications, system infrastructure, etc. within an organization, because 99% of it is already there and 50% cannot be replaced even within a ten year window. However, the businesses require applications that can both scale and interoperate with these systems.

So a couple of quick points to respond to your comments:

* Tier Based approach is not scalable

I can attest that eBay uses a Tier Based approach. They scale pretty well.

I can attest that Amazon uses a Tier Based approach. They scale pretty well.

In the real world, there are no other approaches. No one is getting rid of their web servers on the front and databases and mainframes on the back and consolidating all of their applications and systems into one giant uber application that does everything from answering the front door to persistently managing its own data.

* it introduces lots of moving parts

No. Tier based architectures do not introduce moving parts. Tier based architectures simply acknowledge that there are already lots of moving parts, and those moving parts have a lot of responsibilities already, and those responsibilities are not movable, and those tiers have to work together to deliver additional features, support new applications, and otherwise provide "business value".

* each tier is built as a "silo"

No. Silos go up. Tiers go out. You have gotten this all mixed up. Or out. Or mixed in some direction ;-)

I don't mean to point out the obvious, but you aren't really suggesting that people get rid of tiers, but rather that they replace one of their tiers (the Java EE application server middleware that does messaging and processing) with one of your tiers (your proprietary application server middleware that does messaging and processing). Since Java EE is inappropriate for certain types of workloads (such as the master/worker pattern that your proprietary application server middleware focuses on), your suggestion may be perfectly valid. Disguising it as "getting rid of tiers" is disingenuous, though.

Wow... definitions galore, but no actual definition of scaling
by J. B. Rainsberger

Here we have an author that claims to describe scalability. The author instead waves his (I'm assuming) hands with a nonsense "definition" that says nothing about scalability in concrete terms. "Doing what you do in a bigger way"? Are you kidding?!

The author proceeds to define sub-, supra- and negative; all terms with which I pray the audience is already familiar.

He hints at a definition when he describes the difference between horizontal and vertical scaling.

Still, it would be nice to define scalability at least once. I learned my current definition from Greg Barish's excellent book on high-performance J2EE applications. Scalability is the marginal cost of transaction processing capacity. You measure scalability in dollars per transaction: how much more do I have to spend to be able to process (for example) 1000 more transactions per day?

In a paper entitled "What is scalability?", I would really expect a clear definition at least once.
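The "dollars per transaction" definition above lends itself to a small worked example. The sketch below is one reading of it, not code from Barish's book: the function name is hypothetical, and the spend and capacity figures are invented purely to show the arithmetic.

```python
def marginal_cost_per_transaction(cost_before, cost_after,
                                  capacity_before, capacity_after):
    """Extra dollars spent per unit of extra transaction capacity.

    Costs are total spend in dollars; capacities are transactions per day.
    """
    added_capacity = capacity_after - capacity_before
    if added_capacity <= 0:
        raise ValueError("capacity must increase to compute a marginal cost")
    return (cost_after - cost_before) / added_capacity


# Hypothetical upgrade: spend rises from $10,000 to $12,500 while
# capacity rises from 5,000 to 6,000 transactions per day.
cost = marginal_cost_per_transaction(10_000, 12_500, 5_000, 6_000)
print(f"${cost:.2f} per additional daily transaction")  # $2.50
```

Under this definition, a system scales well if that figure stays flat or falls as capacity grows, and scales poorly if each additional block of capacity costs more than the last.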