Why your cloud is speeding for a scalability cliff

A startup scales up to no avail

Towards the end of 2012 I worked with an internet startup in the online education space. Their web application was not unusual, built in PHP and using Linux, Apache & MySQL, all running on Amazon Web Services. They had three webservers in the mix and were seeing 1000 simultaneous users during peak traffic.

All this sounds normal, except they were hitting major stalls and app slowdowns. Before I was brought in they had scaled their MySQL server from a large to an extra large instance, but were still seeing slowdowns. “What can we do?” they asked.

I dug in and took a look at the server variables. They seemed to have substantial memory allocated to the server and InnoDB. I then dug into the slow query log. This is a great facility in MySQL which sifts through the activity happening against your database and logs the queries that take a long time. In this case we had the threshold set to ½ second and found tons of activity.

What was happening? Turns out there were lots of missing indexes, and badly written SQL queries.

How can we resolve these problems?

The customer asked me to explain the situation. I asked them to imagine finding a friend’s apartment in NYC without an address. Not easy, right? You have to visit all of its 8 million residents until you locate your friend’s home.

This is what you’re asking the database to do without indexes. It’s very serious. The problem compounds when you have hundreds or thousands of other users hitting different pages, all with the same problems. Your whole dataset can fit in memory, you tell me? So-called logical I/Os still cost, and can indeed cost dearly. What’s more, sorting, joining, and grouping all compound the amount of memory your dataset can require.
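To make the address analogy concrete, here is a minimal sketch of the same lookup with and without an index. It uses Python's built-in sqlite3 module as a stand-in, not the client's MySQL setup, and the `residents` table, its columns, and the index name are hypothetical. MySQL's `EXPLAIN` reports the same distinction in its own format.

```python
import sqlite3

# Hypothetical lookup: find one resident's address by name.
LOOKUP = "SELECT address FROM residents WHERE name = ?"


def build_demo_db(n_rows=10000):
    """Create an in-memory table of made-up residents (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE residents (id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
    )
    conn.executemany(
        "INSERT INTO residents (name, address) VALUES (?, ?)",
        ((f"resident{i}", f"{i} Main St") for i in range(n_rows)),
    )
    conn.commit()
    return conn


def query_plan(conn, sql, params=()):
    """Return the human-readable plan SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description.
    return " | ".join(row[-1] for row in rows)


conn = build_demo_db()

# Without an index on name, the only option is to walk every row.
plan_before = query_plan(conn, LOOKUP, ("resident4242",))

# With an index, the database can jump straight to the matching entry.
conn.execute("CREATE INDEX idx_residents_name ON residents (name)")
plan_after = query_plan(conn, LOOKUP, ("resident4242",))

print("before:", plan_before)  # a SCAN of the whole table
print("after: ", plan_after)   # a SEARCH using idx_residents_name
```

The exact wording of the plan varies between SQLite versions, but the first should be a full scan and the second an index search. The query itself never changes, which is why this kind of problem tends to show up in the slow query log rather than in a code review.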

High performance code isn’t automatic

We have automation, we have agile processes, and we can scale web, cache and search servers with ease. The danger is in thinking that deploying in the cloud will magically deliver scalability. Another danger is thinking that ORMs like ActiveRecord in Ruby or Hibernate in Java will solve these problems. Yes, they are great tools to speed up prototyping, but we become dependent on them, and they are difficult to rip out later.

People think art and architecture are overly esoteric, and that is a fair
argument, but I could see these types of vocabularies being used effectively
for something like Airbnb, maybe in conjunction with existing indexing systems.…

The other issue is that, as such an indexing system evolves, it needs to be fairly
accessible to people across the board, yet at the same time standardized.
It might need its own association or something, for standardization but also
because things work better if there is some sort of general agreement.