How Netflix Should Recover From Amazon Addiction

When a startup is in its youth, it is perfectly okay to say, “Amazon Web Services was down” when explaining a disruption in service. If Foursquare has to say that, no problem. For OpenTable, I’m less understanding. For OMGPOP, before it was acquired, it was okay to blame Amazon. For Zynga, no way.

Netflix is way beyond the period in which blaming Amazon is acceptable. Yet, the Christmas Eve outage of Netflix was attributed to a failure at Amazon. The company is rich with engineering resources. The company has the money. But does it have the will?

It is time for Mr. Hastings to demand that Netflix pave the way to a new multi-cloud architecture for reliable scalability. Netflix doesn’t have to go it alone. There are plenty of partners who will help. But Netflix does have to want to get there.

Here’s how it could work.

Recovering from Amazon Addiction

Netflix uses Amazon Web Services in the first place because it provides a tremendous service. Through a set of automated APIs, you can build and configure massive computing capabilities in minutes, drive huge amounts of data and traffic through them, and tear them down just as quickly.

For startups on the way up, this is a godsend. Not only can they get going quickly, but they can scale if they have the good fortune to get a spike in traffic for some reason. All of this is well understood.

The amount of engineering Amazon puts into avoiding failure is awesome. The whole system is built to allow failure to occur anywhere with minimal impact. Amazon has redundancy at every level and lets you spread your infrastructure across separate data centers, called availability zones. Amazon does a great job and failures are rare. In addition, Netflix has shown great leadership with its Simian Army and Chaos Monkey systems to verify its operational quality.

So until a startup is rich and successful, it makes sense for it to just live with its Amazon addiction and suffer the occasional outage with the “blame Amazon” explanation. Startup CEOs should be sure to tell everyone about this dependency in advance so that such a failure doesn’t come as a surprise.

But Netflix is beyond that stage. It is time for Netflix to start recovering from Amazon addiction and take full responsibility for the quality of its service.

A Vision for Multi-Cloud Scalability

What doesn’t make sense is for Netflix or any other company to attempt to replicate what Amazon does, unless it wants to offer its own service. The engineering would cost too much for Netflix to build its own version of Amazon Web Services. But that is not the only choice.

Here are the options that make the most sense in order of plausibility.

The first option is to introduce load balancing of traffic across infrastructure that runs at different cloud providers.

The easiest way to do this would be to stick with clouds that are based on the Amazon APIs. CloudStack has committed to providing a cloud based on the Amazon APIs. Eucalyptus already provides such a software infrastructure. Basho’s cloud storage product uses the S3 APIs. It appears that Amazon is not going to stop companies from using its APIs to create workalike services.
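To make the idea concrete, here is a minimal sketch of what balancing across API-compatible clouds might look like. It assumes every endpoint speaks the same Amazon-style API, so the only thing that changes per request is the address. The class names, the endpoint names, and the URLs are all hypothetical, and a real deployment would need health checks, weighting, and data replication that this sketch leaves out.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Endpoint:
    """One cloud that speaks the same Amazon-style API (hypothetical names and URLs)."""
    name: str
    url: str
    healthy: bool = True


class RoundRobinBalancer:
    """Rotate requests across endpoints, skipping any that are marked unhealthy."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._cycle = itertools.cycle(self._endpoints)

    def pick(self):
        # Look at each endpoint at most once per call before giving up.
        for _ in range(len(self._endpoints)):
            candidate = next(self._cycle)
            if candidate.healthy:
                return candidate
        raise RuntimeError("no healthy endpoint available")


# Placeholder addresses; ".invalid" never resolves to a real host.
endpoints = [
    Endpoint("amazon", "https://storage.amazon-side.invalid"),
    Endpoint("private-cloud", "https://storage.private-side.invalid"),
]
balancer = RoundRobinBalancer(endpoints)
```

With both endpoints healthy, successive calls to balancer.pick() alternate between the two clouds. If one is marked unhealthy, all traffic goes to the other, which is the behavior that would keep a service up during a single-provider outage.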

A harder but perhaps better way for the long term would be to create a set of APIs that abstract the use of cloud resources. You could then implement those APIs on Amazon, OpenStack, Google’s Cloud, or any other cloud. This would be more work, but it would solve the problem for the long term.
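A rough sketch of such an abstraction layer follows. It is illustrative only: the CloudProvider interface, its two methods, and the in-memory stand-in are invented for this example and do not correspond to any vendor's actual API. A real adapter for Amazon or OpenStack would implement the same interface by calling that cloud's own SDK.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Hypothetical provider-neutral interface; not any vendor's real API."""

    @abstractmethod
    def launch_instance(self, image: str) -> str:
        """Start a server from an image and return its identifier."""

    @abstractmethod
    def terminate_instance(self, instance_id: str) -> None:
        """Shut down the server with the given identifier."""


class InMemoryProvider(CloudProvider):
    """Stand-in for a real adapter: records instances in a dict instead of calling a cloud."""

    def __init__(self, name):
        self.name = name
        self.instances = {}
        self._next_id = 0

    def launch_instance(self, image):
        self._next_id += 1
        instance_id = f"{self.name}-{self._next_id}"
        self.instances[instance_id] = image
        return instance_id

    def terminate_instance(self, instance_id):
        # Removing an unknown id is treated as a no-op.
        self.instances.pop(instance_id, None)


def launch_everywhere(providers, image):
    """Launch the same image on every provider through the shared interface."""
    return [provider.launch_instance(image) for provider in providers]
```

Application code written against CloudProvider never names a specific cloud, so adding or dropping a provider means writing or removing one adapter class rather than touching the rest of the system.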