Breakthrough Failures (Part Two)

Read Part One.

At Structure 08, GigaOm's Alistair Croll posed a question to the panel: "How much of scalability is architecture, and how much is throwing servers at the problem?"

"The cheapest way to scale is adding servers, but over time the product is really what drives the infrastructure," said Jonathan Heiliger, the VP of Technical Operations for Facebook. "If the product is bad, that's going to cause problems that are hard to engineer your way out of. When we added chat on Facebook we actually built a new backend for that." (Read more about this in a post by Eugene Letuchy on the Facebook Engineering Blog.)

Scaling rapidly sometimes means deciding between writing custom software and using existing open source or off-the-shelf solutions. "You don't want to have to reinvent the wheel," said Sandy Jen, co-founder of Meebo, the messaging service with 30 million users. "A lot of the stuff that we launched with is open source. At the same time, you have to build a lot of stuff yourself. No one knows your system like you do and no one can scale your system like you can."

Jeremiah Robison, Chief Technology Officer of slide show widget provider Slide, echoed that sentiment. "We built our own object-aware caching system, because our core value was delivering photos faster than anyone else," said Robison. "Anything that's not your core value you can take off the shelf. It's the backend systems that are causing the scalability problems."

When it comes to backend architecture on massive sites, the conversation usually focuses on one area. "There's a lot of hard problems that have been solved," said Flickr's Allspaw. "Most of the hard problems involve databases. Database problems are hard. Whether or not clouds help, they don't help database problems."

MacAskill says Amazon's utility computing services have saved SmugMug more than $1 million. The company operates four data centers, but also stores more than 600 terabytes of data at Amazon S3. MacAskill looks forward to seeing cloud solutions evolve to lighten the infrastructure load on companies. "I would love to not have any more data centers ever," he said. "My business is storing, sharing and delivering photos. Managing storage arrays doesn't help our customers."