Book Review and Excerpt: Scalability Rules

Scalability Rules, the new book by Marty Abbott and Michael Fisher (co-founders of AKF Partners), formalizes the technical solutions to scalability problems into a heuristic science. Drawing on their experience scaling more than 200 Internet sites, the authors have compiled fifty simple scalability rules, organized under 12 broad categories, that can serve as a handy reference for managers, architects, developers and operations staff alike. The rules address architectural and design concerns but do not delve into the nitty-gritty of implementing them with specific technologies. The last chapter provides a much-needed risk-benefit model for evaluating which rules to apply.

The book is well edited and organized, both for newcomers learning practical scalability rules and for experienced scalability architects who want a primer and reference. The authors have kept it concise (255 pages in all, including table of contents, preface and glossary index), and each chapter ends with a short summary discussion.

The first 12 chapters group the rules into 12 broad approaches to scalability. These range from KISS principles that apply from design through release (avoiding gold plating and over-engineering) to horizontal scaling, database scaling, asynchronous communication and stateless/state distribution strategies. Each of the 50 rules opens with two-line responses to four key questions (what, when, how and why), which is handy for future reference without having to skim through all the content.

Another salient feature of the book is the use of cubical graphs (AKF cubes) to depict a range of scalability solutions measured against, typically, three factors. In Chapter 2, "Distribute Your Work", for example, the AKF scale cube maps scalability along three approaches to scaling: scaling through cloning, scaling by splitting different things (data and services) and scaling by splitting similar things (customer data sets). Depending on the scale required, an application can combine an appropriate mix of these three strategies.
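To make the three approaches concrete, here is a minimal Python sketch of a request router that combines them. It is an illustration only and not code from the book: the service names, the shard and clone counts, and the functions `shard_for` and `route` are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from itertools import count

# Hypothetical topology, chosen only for illustration.
SERVICES = ("catalog", "checkout")  # splitting different things (services)
NUM_SHARDS = 4                      # splitting similar things (customers)
CLONES_PER_SHARD = 3                # cloning (identical copies behind a balancer)


@dataclass(frozen=True)
class Target:
    service: str
    shard: int
    clone: int


_round_robin = count()  # naive stand-in for a load balancer


def shard_for(customer_id: str) -> int:
    """Map a customer to a shard with a stable hash, so the same
    customer always lands on the same slice of the data."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def route(service: str, customer_id: str) -> Target:
    """Pick a service pool, then a customer shard, then any clone."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    clone = next(_round_robin) % CLONES_PER_SHARD
    return Target(service, shard_for(customer_id), clone)
```

In this sketch, two requests from the same customer to the same service always reach the same shard but may be served by different clones, which is the point of the mix: clones are interchangeable, while shards and services are not.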

In my opinion, a significant contribution of the book is the last chapter, which not only provides a risk-benefit model but also analyzes the 50 rules against it. This makes the content accessible even to managers, who can now understand and evaluate the decisions made by architects and developers and how those decisions affect scalability from a cost and impact perspective. A number of technologists leapfrog to clustering, cloud infrastructure or other scalable frameworks and infrastructure as low-cost (compared to custom building), quick and proven solutions to scalability, but the risk-benefit model busts this myth.

The risk-benefit model quantifies the approach to scalability by first measuring the risk reduction benefit (R) and the cost impact (C) of each rule, then calculating the priority (P = R - C). Risk is measured as the product of the probability of a problem occurring and its impact, which means risk reduction can serve as a proxy for benefit. Here is a snippet of the top 5 priority rules:

Rule 19: Relax Temporal Constraints
Rule 25: Make Use of Object Caches
Rule 29: Failing to Design for Rollback is Designing for Failure
Rule 32: Use the Right Type of Database Lock
Rule 46: Be Wary of Scaling through Third Parties

and a sampling of low priority rules:

Rule 13: Design to Leverage the Cloud
Rule 17: Don't Check Your Work
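
The priority calculation described above (P = R - C, with risk taken as probability times impact) can be sketched in a few lines of Python. The scoring scale, the placeholder rule names and every number below are made up for illustration; they are not the authors' ratings for any actual rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleScore:
    name: str
    probability: float  # likelihood of the problem occurring (hypothetical scale)
    impact: float       # severity if it does occur (hypothetical scale)
    cost: float         # cost of applying the rule (hypothetical scale)

    @property
    def risk_reduction(self) -> float:
        # R: risk is probability times impact, used as a proxy for benefit.
        return self.probability * self.impact

    @property
    def priority(self) -> float:
        # P = R - C
        return self.risk_reduction - self.cost


def rank(rules):
    """Return rules ordered from highest to lowest priority."""
    return sorted(rules, key=lambda r: r.priority, reverse=True)


# Placeholder rules with invented scores, for illustration only.
rules = [
    RuleScore("Rule A", probability=2, impact=2, cost=3),
    RuleScore("Rule B", probability=4, impact=5, cost=3),
]
ranked = rank(rules)
```

With these invented inputs, Rule B (R = 20, P = 17) ranks above Rule A (R = 4, P = 1) even though both cost the same to apply, because the problem it guards against is both more likely and more damaging.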

Let's take Rule 19 as an example. Simply stated, this rule recommends against enforcing constraints on the state of objects across all user actions in order to maintain consistency at all times. It is best exemplified by e-commerce applications, where a considerable amount of time can elapse between the user viewing an item, placing it in a shopping cart and eventually purchasing it or cancelling the transaction. The cost of maintaining consistency of the object's state through the entire process is prohibitively high, as Brewer's CAP theorem explains. In highly distributed systems such as an e-commerce application, availability cannot be compromised, but strict consistency can be traded for eventual consistency. For the shopping cart, the authors recommend not locking the object until it is placed in the cart. As a result, a user may be denied placing an item in the cart after viewing it because, in the interim, another user has purchased it. This inconvenience for the customer is a reasonable compromise for scalability.
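
The shopping cart scenario can be sketched as follows. This is a single-process toy, not the authors' implementation: the `Inventory` class, its method names and the `OutOfStock` exception are hypothetical, and an in-memory lock stands in for whatever reservation mechanism a real store would use.

```python
import threading


class OutOfStock(Exception):
    """Raised when an item sold out between viewing and adding to cart."""


class Inventory:
    def __init__(self, stock: dict):
        self._stock = dict(stock)
        self._lock = threading.Lock()

    def view(self, item_id: str) -> int:
        # Viewing takes no lock and reserves nothing, so the count
        # may already be stale by the time the user acts on it.
        return self._stock.get(item_id, 0)

    def add_to_cart(self, item_id: str, cart: list) -> None:
        # The constraint is enforced only here, at the moment the
        # item is placed in the cart.
        with self._lock:
            if self._stock.get(item_id, 0) <= 0:
                raise OutOfStock(item_id)
            self._stock[item_id] -= 1
        cart.append(item_id)


inventory = Inventory({"lamp": 1})
seen_by_first = inventory.view("lamp")   # both users see one lamp available
seen_by_second = inventory.view("lamp")

first_cart, second_cart = [], []
inventory.add_to_cart("lamp", first_cart)  # first user gets it
try:
    inventory.add_to_cart("lamp", second_cart)
    second_denied = False
except OutOfStock:
    second_denied = True                   # second user is turned away
```

The second user saw the item as available but is refused at the cart step, which is exactly the inconvenience the rule accepts in exchange for never holding state across the much longer viewing phase.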

At the other end of the priority spectrum is "Rule 13: Design to Leverage the Cloud", which states that cloud infrastructures are specifically meant for handling spiky demand; its risk reduction benefits are low, however, considering the probability and impact of the affected business processes. Batch jobs, test environments and user loads generated by seasonal or promotional offers are sample candidates for the cloud. The authors rank "Rule 46: Be Wary of Scaling through Third Parties" much higher in the priority spectrum, since it promotes architectural simplicity and control over one's destiny and costs, both of which can be undermined by the use of vendor products. The risk reduction benefits in this case are much higher than those of jumping at the first opportunity to use the cloud for scalability.

As with cloud providers, the onslaught of new and existing persistence solutions has made the decision-making process more complex. InfoQ, in collaboration with InformIT, is sharing "Rule 14: Use Databases Appropriately" from the Scalability Rules book, which provides insight to improve your decision making.

About the Book Authors

Martin L. Abbott is an executive with experience running technology and business organizations within Fortune 500 and startup companies. He is a founding partner of AKF Partners, a consulting firm focusing on meeting the technical and business hyper growth needs of today’s fast-paced companies.
Marty was formerly the COO of Quigo, an advertising technology startup acquired by AOL in 2007, where he was responsible for product strategy, product management, technology development, advertising, and publisher services.

Prior to Quigo, Marty spent nearly six years at eBay, most recently as SVP of Technology and CTO and member of the CEO’s executive staff. Prior to eBay, Marty held domestic and international engineering, management, and executive positions at Gateway and Motorola. Marty serves on the boards of directors for OnForce, LodgeNet Interactive (NASD:LNET), and Bullhorn. He sits on a number of advisory boards for universities and public and private companies.

Marty has a BS in computer science from the United States Military Academy, an MS in computer engineering from the University of Florida, is a graduate of the Harvard Business School Executive Education Program, and is pursuing a Doctorate of Management from Case Western Reserve University. His current research investigates the antecedents and effects of conflict within executive teams of startups.

Michael T. Fisher is a veteran software and technology executive with experience in both Fortune 500 and startup companies. “Fish” is a founding partner of AKF Partners, a consulting firm focusing on meeting the technical and business hyper growth needs of today’s fast-paced companies. Michael’s experience includes two years as the chief technology officer of Quigo, a startup Internet advertising company acquired by AOL in 2007. Prior to Quigo, Michael served as vice president of engineering & architecture for PayPal, Inc., an eBay company.

Prior to joining PayPal, Michael spent seven years at General Electric helping to develop the company’s technology strategy and processes. Michael served six years as a captain and pilot in the US Army. He sits on a number of boards of directors and advisory boards for private and nonprofit companies.

Michael has a BS in computer science from the United States Military Academy, an MSIS from Hawaii Pacific University, a Ph.D. in Information Systems from Kennedy-Western University, and an MBA from Case Western Reserve University. Michael is a certified Six Sigma Master Black Belt and is pursuing a Doctorate of Management from Case Western Reserve University. His current research investigates the drivers for the viral growth of digital services.

The excerpt is from the book, ‘Scalability Rules: 50 Principles for Scaling Web Sites’, authored by Marty Abbott and Mike Fisher, published May 2011 by Pearson/Addison-Wesley Professional, Copyright 2011 AKF Consulting Inc. ISBN 0321753887. For more info please visit the publisher site.

I'm a bit surprised that the authors discuss RDBMS, NoSQL, key-value stores and file systems, but omit ODBMS. Like RDBMS, ODBMS provide ACID properties, but they are far better at handling complex data models with complex relationships. They are better in that they are faster, easier to maintain, and don't require implementing, buying, maintaining or using an O/R mapping layer. Unfortunately, enterprise ODBMSs are not cheap either. I would also have wished for a mention of graph databases, which are even better at handling relationships. Some examples of ODBMS are: