Want to reduce your cloud costs by 70 percent? Here’s how

Colocation, which means buying your own hardware up front and running it yourself in a third-party facility, is not usually seen as a cheaper alternative to the cloud. But, oddly enough, it can be.

Last week I compared cloud instances against dedicated servers and showed that for long-running uses such as databases, dedicated servers are significantly cheaper. But that’s not the end of it. You are still paying for those server resources every month, so if you project the costs out one or three years, you end up paying much more than if you had bought the hardware outright. This is where buying your own hardware and colocating it becomes the better option.


Continuing the comparison with the same specs for a long-running database instance: if we price a basic Dell R415 with two processors of 8 cores each, 32GB RAM, a 500GB SATA system drive and a 400GB SSD, the one-time list price is around $4,000 – less than half the annual price of the SoftLayer(s ibm) server at $9,468/year we came up with in our previous analysis.

Of course, the price you pay SoftLayer includes power and bandwidth, and these fees depend on where you locate your server. Power usage is difficult to calculate because you need to actually stress test the server to figure out the maximum draw, and then run real workloads to see what your normal usage is.

My company, Server Density, just started experimenting with running our own hardware in London. Our 1U Dell(s dell), with very similar specs to those discussed above, drew 0.6A under normal load but hit 1.2A when stress tested with everything maxed out. Hosting this with the ISP that supplies our office works out at $161/month, or $1,932/year (it would be cheaper to take a whole rack at a big data center, but this was just our first step).
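As a rough sketch of how those amp readings translate into energy figures: the 0.6A/1.2A numbers are from the test above, but the supply voltage and per-kWh rate below are illustrative assumptions, and a real colo bill like the $161/month figure also bundles rack space, bandwidth and provisioned (not metered) power.

```python
# Sketch: converting measured current draw into monthly energy figures.
# The 0.6A normal / 1.2A stress-test values come from the article; the
# 230V supply and $0.20/kWh rate are assumptions for illustration only.

HOURS_PER_MONTH = 730  # average month

def monthly_kwh(amps, volts=230.0):
    """Energy used per month at a steady current draw."""
    watts = amps * volts
    return watts * HOURS_PER_MONTH / 1000.0

normal = monthly_kwh(0.6)  # typical workload
peak = monthly_kwh(1.2)    # stress-tested maximum; colo power is usually
                           # provisioned against this figure, not the average

rate_per_kwh = 0.20  # assumed rate, not a real quote
print(f"normal: {normal:.0f} kWh/mo (~${normal * rate_per_kwh:.0f} energy)")
print(f"peak:   {peak:.0f} kWh/mo (~${peak * rate_per_kwh:.0f} energy)")
```

Note that providers generally size (and bill) your feed against the stress-tested peak, which is why it is worth measuring rather than guessing.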

This makes the total annual cost look as follows:

Remember, again, that this is a database server, so while with Rackspace(s rax), Amazon(s amzn) and SoftLayer(s ibm) you pay that price every year, with colocation the annual cost after the first year drops to $1,932 because you already own the hardware. Further, the hardware can be treated as an asset, which has tax benefits.
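Using the figures above, a quick sketch of how the cumulative costs diverge over time (hardware failures, power-draw variation and staff time are deliberately left out):

```python
# Sketch: cumulative cost of the database server from the comparison above.
# Cloud figure is the $9,468/year SoftLayer dedicated price; colo is the
# ~$4,000 one-off Dell R415 purchase plus $1,932/year hosting.

CLOUD_PER_YEAR = 9_468
COLO_HARDWARE = 4_000
COLO_PER_YEAR = 1_932

def cumulative(years):
    """Total spend after N years for each option."""
    cloud = CLOUD_PER_YEAR * years
    colo = COLO_HARDWARE + COLO_PER_YEAR * years
    return cloud, colo

for years in (1, 2, 3):
    cloud, colo = cumulative(years)
    saving = 1 - colo / cloud
    print(f"year {years}: cloud ${cloud:,} vs colo ${colo:,} ({saving:.0%} less)")
```

Even in year one the colo option comes out ahead, and the gap widens every year because only the $1,932 hosting fee recurs.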

Server Density is still experimenting on a small scale, so I spoke to Mark Schliemann, VP of technical operations at Moz.com, which runs a hybrid environment. Moz recently moved the majority of its environment off of AWS and into a colo facility with Nimbix, but it still uses AWS for processing batch jobs (the perfect use case for elastic cloud resources).

Moz worked through detailed cost comparisons to factor in the cost of hardware leases (routers, switches, firewalls, load balancers, SAN/NAS storage and VPN), virtualization platforms, miscellaneous software, monitoring software/services, connectivity/bandwidth, vendor support, colo fees and even travel costs. On a per-server basis, this works out at $3,200 per month on AWS versus $668 per month on its own hardware. Projected out one year, that is $8,096 versus $38,400 on AWS.

Moz’s goal for the end of the first quarter of 2014 is to be paying $173,000 per month for its own environment plus $100,000 per month for elastic AWS cloud usage. If it remained entirely on AWS it would work out at $842,000 per month.
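A quick check of the headline saving implied by those Moz numbers:

```python
# Sketch: the saving from Moz's projected hybrid setup above. Hybrid spend
# is $173k/month for its own environment plus $100k/month of elastic AWS,
# versus an estimated $842k/month staying entirely on AWS.

hybrid = 173_000 + 100_000
pure_aws = 842_000
saving = 1 - hybrid / pure_aws
print(f"hybrid: ${hybrid:,}/mo, saving {saving:.0%} vs pure AWS")
```

That works out to roughly a two-thirds reduction, which is where the ballpark figure in the headline comes from.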

Sometimes cloud is just harder

Optimizing utilization is much more difficult in the cloud because of fixed instance sizes. Moz found it much more efficient to run its own systems virtualized because it could create exactly the instance sizes it needed. Cloud providers often increase CPU allocation alongside memory, when in real-world use you tend to need one or the other. Running your own environment lets you optimize this, and it was one of the big levers Moz used to improve utilization and become much more efficient with its spend.

Right now we are able to demonstrate that our colo is about 1/5th the cost of Amazon, but with RAM upgrades to our servers to increase capacity we are confident we can drive this down to something closer to 1/7th the cost of Amazon.

Colocation has its benefits once you’re established

Colocation looks like a winner but there are some important caveats:

First and foremost, you need in-house expertise to build and rack your own equipment and to design the network. Networking hardware can be expensive, and when things go wrong, you have to deal with the problem yourself. That can mean support contracts with vendors and/or training your own staff, but it does not usually require new hires: the same team that deals with cloud architecture, redundancy, failover, APIs and programming can handle the ops side of running your own environment.

The data centers you choose have to be easily accessible 24/7, because you may need to visit at unusual times. That means having people on call and available to travel, or paying the data center’s remote hands high hourly fees to fix things.

You have to purchase the equipment up front, which means a large capital outlay, although this can be mitigated by leasing.

Personnel costs not as different as you think

So what does this mean for the cloud? On a pure cost basis, buying your own hardware and colocating it is significantly cheaper. Many will say the real cost is hidden in staffing requirements, but that’s not the case, because you still need a technical team to build your cloud infrastructure.

At a basic level, compute and storage are commodities. The way the cloud providers differentiate is with their supporting services. Amazon has been able to iterate very quickly on innovative features, offering a range of supporting products like DNS, mail, queuing, databases, auto scaling and the like. Rackspace has been slower to do that but is now starting to offer similar features.

The flexibility of cloud needs to be highlighted again, too. Once you buy hardware you’re stuck with it for the long term; the point of the example above is that the workload is known.

Considering the hybrid model

Perhaps a hybrid model makes sense, then? This is where I believe a good middle ground lies, and we can see Moz making good use of such a model. You can serve your known workloads with dedicated servers and then connect to the public cloud when you need extra flexibility. Data centers like Equinix offer Direct Connect services into the big cloud providers for this very reason, and SoftLayer offers its own public cloud alongside dedicated instances. Rackspace is placing bets in all camps with public cloud, traditional managed hosting, a hybrid of the two, and support services for OpenStack.

And when should you consider switching? In September, Dell(s dell) cloud exec Nnamdi Orakwue said companies often start looking at alternatives when their monthly AWS bill hits $50,000 but maybe even that number is too high?

Comments

Like your last article, this one is just as awful and misleading. You have done nothing more than focus on technology, the typical techie argument for justifying why in-house is better than outsourcing. Cloud computing is not just about costs; there are many dimensions to consider when making the business case for or against cloud computing. Every organization is unique, and your approach is oversimplified.

You really need to do your homework on making a proper business case to support your arguments.

I expect we will see more of this kind of article once the cloud backlash starts; there have been very few so far. Basically, the cloud has some uses but, like every other technology, limited application.

For start-ups and other short-term projects, the cloud can be terrific. But even those projects must be built in a way that lets you easily move them, so you don’t get trapped in cloud services, because many short-term and start-up projects become long-term.

Furthermore, most decent operations departments can boast much better uptime than most cloud services. When I was challenged about outsourcing our Exchange system at one job, I just showed management my uptime record for mail. It was four or five nines, I don’t remember which. Most cloud services pledge three at most.

So there are plenty of other ways to do things.

The cloud is just an optimized non geo-specific but geo-sensitive session-aware redundant failover cluster with a simplified interface.

People initially get sticker shock from their public cloud bills because they typically have no tools deployed to understand how they are using the public cloud. Once they deploy tools from companies like Cloudyn, Scalr, etc., they soon see what they are doing and how they can optimize usage and reduce costs. Take this into account and you see remarkable savings materialize, along with CPU optimization.

Unfortunately this article omits the new generation cloud providers, which are able to drive the cost argument back into their favor, even in the context of long term use provisioning.

First, the new generation of cloud providers removes resource bundling completely, so you provision and pay for only the exact amount of resources (RAM, CPU, storage) consumed, measured at a highly granular level and independently of each other. You can also modify the quantities dynamically at any point in time. Usage is charged by the minute, or by resource-pool subscriptions for long-term use (pool meaning the resource quantity provisioned is not VM-specific; it covers the entire environment). This alone offers huge flexibility and cost savings over all the traditional cloud providers, who only offer fixed server configurations.
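As an illustration of the unbundled, per-minute billing model described above (all rates here are hypothetical; real providers publish their own per-unit prices):

```python
# Sketch of unbundled billing: each resource is metered independently and
# charged by the minute. Every rate below is hypothetical, chosen only to
# show the shape of the model, not any provider's actual pricing.

RATES_PER_MINUTE = {
    "ram_gb": 0.00005,
    "cpu_core": 0.0002,
    "storage_gb": 0.000002,
}

def cost(minutes, ram_gb, cpu_cores, storage_gb):
    """Total charge for running an environment at these levels."""
    per_minute = (ram_gb * RATES_PER_MINUTE["ram_gb"]
                  + cpu_cores * RATES_PER_MINUTE["cpu_core"]
                  + storage_gb * RATES_PER_MINUTE["storage_gb"])
    return minutes * per_minute

# A fixed 8-core/32GB bundle vs the 2 cores / 24GB actually needed:
month = 30 * 24 * 60  # minutes in a 30-day month
print(f"bundled : ${cost(month, 32, 8, 500):.2f}")
print(f"exact   : ${cost(month, 24, 2, 500):.2f}")
```

The point is that with fixed instance sizes you pay the "bundled" line even when your workload only needs the "exact" one.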

But it doesn’t stop there: next, they introduced high-performance computing elements that make the VM you provision much more powerful than the same configuration in colo or a first-generation cloud. This offers another huge source of cost optimization, as you can achieve the same result with fewer resources and therefore lower cost.

One example is the complete replacement of slow and error-prone HDD storage with new-generation SSD technologies (which, by the way, are sold at the same cost as HDD). The new SSDs offer IOPS SLAs and are designed from the ground up to serve a multi-tenant public cloud environment, thereby avoiding the typical and very costly storage bottlenecks of traditional public clouds. Another example of cost-benefiting innovation is the implementation of full SDN, offering high network performance, configuration options and optimizations simply not available in the traditional public clouds.

Add it all up and you will find the bottom line: a price-performance ratio simply unmatched by colocation and traditional cloud providers. These new-generation cloud providers are readily available, in production and running enterprise-class implementations live as we speak.

In my last 10 years as a systems engineer I have worked with traditional environments like colocation and housing, and with public and private cloud environments, and I agree with David, Mark Schliemann, Mike and Jaime.

Some of you are comparing AWS or other public clouds with Server Density or Moz; in my opinion that makes no sense. It’s true that AWS has a lot of engineers and is pushing public cloud technology forward, but consider that David may not really need most of these technologies. Maybe he has a controlled stack and a well-designed roadmap for the next few years (as Moz said), and maybe the upfront cost is not a problem because he has plenty of money (funding?).

I have heard many times that “I will spend the money on engineers instead of on a public cloud company.” That’s partially true: I will spend the money on engineers, but less money, and the big difference is the knowledge! I keep deep knowledge of my environment (and more) within the company.

A lot of people say that big public cloud companies can manage these cloud environments better than small companies. Obviously they are underestimating the professionals who prefer to work at small companies and startups; you don’t need thousands of engineers to do the best work. Systems engineers were achieving 99.99% uptime per year several years before public clouds existed.

In my opinion:

Use public cloud when:

– You are just beginning: it’s just an idea, or a very small startup
– Your business model produces huge traffic peaks, or forces your infrastructure to grow or shrink very fast (in less than 2-3 days)
– You don’t have enough money in the bank for the upfront cost
– You want to build different prototypes because your project is not fully defined

Use colocation/private cloud when:

– You have just left the startup zone, or you pay $30k-40k/month (I know that’s a blurred line)
– You don’t need many of the SaaS or PaaS offerings that public clouds provide
– You have the knowledge (in your team) of networking, firewalls, switches, virtualisation, etc.
– You prefer to spend the money on your team instead of on other companies
– You want to keep full control over the platform
– You want to avoid noisy neighbours, because they are dangerous for your business (not present in all public clouds)

Over time, a subscription will at some point “cross over” and become more expensive than a capital purchase. Add in a very consistent workload, and I’d be surprised to see a cloud model come out cheaper over a year or two.

When customers use subscription services, they are paying for flexibility, especially in the instance of a highly varying or fluctuating workload.

Very good point, and this is where you can really optimise costs on AWS et al., using it for those flexible workloads where you can take advantage of hourly or per-minute pricing. Excellent flexibility for huge numbers of servers at very low cost for short-term workloads.

The most expensive part of infrastructure is the people that manage it. Not the kit or the power.

AWS is more expensive than dedicated because they do a lot of the undifferentiated heavy lifting for you. I for one want my team focused on our value-add, not playing around with server configs. And I want the flexibility to scale.

One reason Rackspace is more expensive than AWS is that they put more people per customer than anyone else. And people are expensive.

I recently completed an apples-to-apples comparison of Amazon EC2, Rackspace, Azure and co-location for a full virtual SQL and web server (Microsoft-based). Co-location on shared servers was less expensive than any option with the same power (4 cores or more, 16GB RAM, 1TB disk space) on any other platform. Co-location on lease-to-own servers was about the middle of the pack, with costs dropping off dramatically once the lease converted.

Let me start by saying that I sell cloud, managed hosting and co-location, and have done so for 15 years. Each has its uses, and whether it is fit for purpose depends on the client’s maturity and size.

Why choose co-lo? If you have a full CMDB solution, you have the technical support staff and you are local to the datacenter, then co-lo might be a good fit.

However, if you only have a few servers and aren’t local, any physical inspection of the device will cost you between $150 and $250 per hour… that quickly adds up (and is easily forgotten during budgeting).

To dismiss “staff costs” with “well, you need technical staff anyway” is very risky. In any IaaS model you still need technical staff; that isn’t the point. The point is staff profiles: how many different profiles do I need to support my infrastructure? Sure, a DBA can replace a hard drive, but while he’s traveling and doing the work he can’t support dev-ops!

To support any managed hosting and cloud platform we need about 21 different profiles (think network, storage, security, hypervisor, monitoring, tier-1/2 support, CMDB, compliance, change/incident/problem/capacity/service management, managers, etc.). None of these is hard to find individually, but they are pretty hard to find in a single person, available 24/7. This is what a managed hosting and cloud provider brings to the table (not the technology or even the datacenter).

A hybrid of co-lo plus cloud seems an ill fit (the operations models don’t match). From my experience the hybrid solution should be managed hosting plus cloud; both have similar support structures, and depending on performance and operational needs, managed dedicated hosting might still be better.

Agreed that if you only have a few servers then using AWS compute or a managed provider makes sense – that’s not the right level of scale to make it worth dealing with hardware and in particular, networking.

If you’re not a tech company then having an in-house team will be expensive. If software and development aren’t your core business then you’re right that hiring the expertise to run your own servers is probably not worth it. AWS-style compute is probably not appropriate either, because you need the same kind of expertise, so just let a managed hosting provider deal with it all.

The example of my own company and Moz makes that assumption – we’re both core tech software companies, which I should have included as a point in the article – thanks for bringing it up!

I agree that in most cases you will not lower your hard costs, but you need to think of cloud/hosting from the standpoint of strategy, process, people and technology that enable business agility.

Question for the Business folks: Is IT getting in the way of your business?

I highly recommend engaging http://www.ColoAdvisor.com before setting out to make a move to Cloud, Colocation or Managed Hosting Services. The choices are so numerous that even I get a bit confused at times.

I agree with David: many people believe you only need to plug into the cloud and fire the IT staff. Cloud is not magic; you still need sysadmins, DBAs, etc. There is also another option for avoiding hardware staff: managed hosting. Having said that, I believe hybrid cloud may be the best option.

Here’s a bit of insider scoop. You lease space at a data center for a couple of years. You move in all your gear. Your lease comes up for renewal. What do you suppose happens? If your guess is, “they obscenely jack up the price knowing how painful it would be for you to move,” then you are a winner. I worked for a while managing large-scale storage at Kodak Gallery and the majority of my job involved moving data from one data center to another and coordinating equipment moves to deal with leases coming due.

This is what contracts are for – you get price increases capped by the terms. In fact, this is the default in the contracts I’m looking at from Equinix right now:

“Notwithstanding anything in this Agreement to the contrary, after the first twelve months of the initial service term, Equinix may change the Service Fees for power Services at a rate not to exceed five percent (5%) per year unless Equinix’s direct electrical supply costs increases by more than five percent (5%) per year, in which case Equinix may increase the Service Fees by such increased cost.”

There’s some wiggle room in there for Equinix but that’s what negotiations are for.
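Under the clause quoted above, the worst-case power fee compounds at the 5% cap after the first twelve months (absent a pass-through of larger electricity cost rises). A small sketch with a hypothetical $1,000/month base fee:

```python
# Sketch: worst-case power fee under the quoted Equinix clause -- no
# increase in the first twelve months, then at most 5% per year. The
# $1,000/month base is hypothetical, purely for illustration.

def max_power_fee(base, year, cap=0.05):
    """Maximum monthly fee in a given contract year (year 1 = initial term)."""
    increases = max(0, year - 1)  # first twelve months are fixed
    return base * (1 + cap) ** increases

base = 1_000.0
for y in (1, 3, 5):
    print(f"year {y}: up to ${max_power_fee(base, y):,.2f}/mo")
```

Even at the cap, five years of compounding raises the fee by just over 21%, which is a far cry from the "obscene" renewal hikes described in the comment above.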

So according to this definition, “Cloud” just means “on the Internet?” Before cloud became a trendy marketing term it meant high availability through redundancy, address abstraction, run-time resolution of addresses, and ubiquity within its defined scope of service.

And yes, it absolutely requires significant engineering work to get redundancy.
And yes, “cloud” providers have had mixed success in providing that.

But that doesn’t all of a sudden mean a CoLo server = cloud. The article title should be “Save 70% on your cloud infrastructure by implementing something that doesn’t come close to qualifying as cloud but calling it that anyway.” I hope you have a pointy-haired boss to sign off on that.

Marketers have co-opted and diluted the term “cloud,” taking advantage of the lack of formal definition of terms in order to sell stuff to unsophisticated buyers. We expect that from them but not from journalists who know the space, and not from Gigaom. Hugely disappointed.

The point of article titles is to grab your attention; that’s how they work. If you actually read the article you’ll see there is a specific definition being used in the comparison: compute, networking and, to some extent, storage.

You’re right that “cloud” is a marketing term and has a broad definition. This article is specifically not about all the supporting services the likes of AWS (and more recently, Rackspace) provide like load balancing, DNS, databases, etc. Those are much harder to compare.

I’m not sure the reply helped your case, David. Western Digital’s “Personal Cloud” is a single-spindle, small-form-factor computer designed to tunnel through UPnP routers. It does provide run-time address resolution, but only because home users typically have dynamic IPs or are behind carrier-grade NAT. It is almost as far as one can get from what the term cloud is supposed to mean. But Western Digital will sell a ton of “personal cloud” devices by redefining the term to mean “you can get to it on the Internet” and targeting a market of lay persons who think they are getting a miniature version of enterprise-grade technology.

So when you say “The point of article titles is to grab your attention, that’s how they work,” it acknowledges the discrepancy between the headline and the content and suggests a strategy of getting a ton of page views by co-opting the term cloud to target readers who actually believe it means “you can get to it on the Internet.” I expect that from marketers but not from journalists. You appear to be arguing that I should lower my expectations of the Gigaom brand and content. Is that Gigaom’s editorial policy or yours?

Please don’t credit me with being “right that ‘cloud’ is a marketing term and has a broad definition,” because that isn’t what I said. As long as I’ve been implementing cloud infrastructure it has had a more specific meaning, which included high availability through redundancy, address abstraction, run-time resolution of addresses, and ubiquity within its defined scope of service. What I said was that the term had been co-opted by marketers – people who use the discrepancy between cloud as understood and practiced in the data center for the last decade and cloud as perceived in mainstream media to extract money from the market. That’s not a good thing you should embrace.

If cloud now has a broad definition, and if people *inside* the datacenter have begun to use the marketing definition of the term, then the result is a significant degradation in the reliability and availability of commercial services… and in the quality of journalism serving *that* market. The number of comments about how good the article is suggests that the marketing definition of cloud is taking over, that such degradation of service is happening, and that this article has directly contributed to that decline.

What is the reader to take away from your concluding statements that “this article is specifically not about all the supporting services” and that “those are much harder to compare”? Isn’t that exactly what investigative journalism is supposed to do: take on the hard topics that readers do not have the skill or resources to tackle themselves? The headline offers me a way to save 70% on my cloud infrastructure and promises the answer will surprise me. The answer turned out to be “don’t build an actual cloud infrastructure but host a stand-alone server in a CoLo,” and it did, in fact, surprise me, but not in a good way. If I as a reader accept your premise that reporting on actual cloud infrastructure is hard, am I to conclude that it is beyond Gigaom’s skills, beyond its funding, or that Gigaom’s editorial policy is to avoid hard topics?

It’s generally accepted within the tech community that if you say you’re “hosted on the cloud” then it’s likely to be on services provided by AWS (or some competitor). At the core of these services is compute. That’s the definition of “cloud” being used here.

I’m not really sure what you’re trying to say other than discussing the semantics of “cloud”, which isn’t the topic of the article and is irrelevant when there’s already an understanding of the meaning in this context. We’re clearly not talking about consumer “cloud” products like the WD Personal Cloud.

The point of the article is to address the misconception that infrastructure cloud services like EC2 are much cheaper than the alternatives. If that’s not clear I’d be interested to hear your suggestions for improvements. Likewise, if you disagree I’d like to hear your arguments why the analysis is wrong.

If you are proposing that the CoLo solution described *is* cloud, then it isn’t clear what criteria are used in the article to qualify something as “cloud.” Can you explain what makes the solution in the article qualify as “cloud?” In the context of this article, does “cloud” just mean “on the Internet?”

Or perhaps I’m reading too much into the title and it was never your intent to characterize a stand-alone, non-redundant CoLo solution as “cloud.” Perhaps the article is arguing that a business requirement for cloud can be scaled back and it was my mistake to assume the solution you offered was being presented as cloud. In that case, the article would be saying “why buy cloud if you don’t need it and can get by with a CoLo instance instead.”

So if you are saying that the CoLo solution described *is* cloud, it would help me a lot to understand what specifically qualifies as “cloud” as you are using the term here. In this context, cloud appears to mean “off-site, 3rd party hosting”. But if you aren’t claiming the CoLo solution is cloud, then I apologize for misreading the intent of the article.

I don’t think I said anywhere that the colo is supposed to be a (private) “cloud”; it’s meant to replace the compute/storage products from the likes of AWS, which is a cloud vendor. Perhaps the title is a little confusing on that point, but what it is trying to say is that you can reduce your cloud costs by 70% by switching to a hybrid model. The Moz example is doing that – they are now spending $173k on their own environment and $100k on AWS vs $842k on pure AWS. This is where the “reduce” comes from.

A private cloud would imply offering at least compute and storage on a virtualised basis with access to non-ops e.g. via OpenStack projects, and probably also include other PaaS features like DNS, queuing, databases, etc. That’s not what we’re looking at here.

OK, thanks. That clears it up. Based on the headline I assumed “reduce cloud costs” meant I still had a cloud but that it was less expensive. I was also suffering from the assumption that “cloud” was used in the traditional sense meaning something that was made highly available through redundancy, abstract addressing and dynamic resolution.

Now I understand that the intended result is a hybrid and not attempting to characterize a stand-alone, non-redundant CoLo server as “cloud.” So if my client has business requirement for cloud in the traditional IT sense, this is not it.

The reason I mentioned Western Digital is that their definition of cloud is primarily compute and storage, and omits as a requirement those features that provide HA. The marketer/consumer-grade cloud, in other words. My wristwatch is primarily compute, storage and display, it’s not a high bar to set. Your working definition of cloud as virtualized compute and storage with references to OpenStack and other services seems closer to Western Digital’s definition than the traditional meaning of the term in which some HA features were the whole point. That also helps to put the article in context. Thanks.

I cannot say that I have ever recommended that a client use any kind of cloud service other than a static file CDN. One thing a lot of people don’t realize is that the virtualization used loses a lot of processing power. I have also noticed huge latency when accessing databases.

I recently ran hardware in a colo for over 3 years. This is so wrong it’s hard to know where to begin, but I’ll go with the topic of redundancy. When you pay Amazon (or the like) for a server, although it feels like a single server, you’re actually paying for a mountain of complex virtualized infrastructure that ensures that one server is always up. To do this on your own with anywhere near the same amount of reliability requires 3 servers all running an expensive virtualization platform and all connected to a SAN and a redundant pair of managed switches, and redundant firewalls. Even after all of this is bought and configured properly (which requires a huge amount of expertise across multiple disciplines), it probably won’t come close to a cloud providers’ uptime.

You’re assuming that the cloud provides magic redundancy. That’s not true: it requires significant engineering work to get redundancy – just look at the time and effort Netflix has put into building a range of extra software tools to help run things. The belief that the cloud provides redundancy for no effort is very commonly held, and it’s totally incorrect.

You gain redundancy in cloud or non-cloud environments by having multiple servers or devices, and in both situations you have to pay for them. Buying the hardware outright is significantly cheaper in the long run (and the short term if you lease) and you can often buy significantly higher spec machines so you can actually run fewer of them.

At Server Density we’ve been planning our switch from SoftLayer onto our own hardware, and even with redundant VM hosts, switches, routers and data centers, we’re going to be reducing costs by around 60-70%.

Reliability can actually be higher with your own hardware because you get to control the components. In the cloud you have no idea what the underlying architecture is. There are many instances of poor reliability of AWS components, especially EBS. We’ve seen far more issues with our cloud instances at SoftLayer than on any of the dedicated servers we have with them.

I think you are saying the same thing as Jake – it takes a significant amount of work beyond just buying the hardware. So the calculations based purely on hardware might just be the tip of the iceberg. People costs and maintenance costs need to be factored in.

That was the point of including the second example from Moz in the article because they have factored in every cost: hardware, software, licenses, training, travel, etc.

My first example was continuing on from the earlier example comparing the pure hardware costs, where buying your own is much cheaper. As you say the extra costs as mentioned above also need to be considered which is why I included the example from Moz.

All this shows how much cheaper it is to run things yourself in a colo environment.

Anecdotal evidence from one company’s experience is hardly conclusive, plus they have not even reached their goals; they only have projections.

For the numbers to even begin to match up you have to compare buying your own hardware vs. a three-year commitment for AWS services, which reduces their cost by 75%. So if you are OK with a three-year investment, as someone following the advice in your article clearly would be, then why not save 75% committing to AWS rather than 70% to self-manage and deal with a bunch of extra work? Plus you have made the assumption that Moz’s future infrastructure will be just as solid as Amazon’s and that is quite a big assumption. Ever had to deal with a data center whose HVAC dies on a Sunday and you have to turn your servers off or watch them melt? I have. If the team Moz has to manage ops can do no better with AWS than they can with self-managed then they have some deep issues or they have no idea what they do not know or both.

I am actually a fan of hosting one’s own infrastructure, but only at a scale where it is the obvious choice. Google will turn up any number of articles about how much companies have saved by moving to the cloud; heck, AWS has its own article about how several companies saved an average of 70% by moving to AWS. http://aws.amazon.com/economics/

We have much more than projections. Sure, our projected plan isn’t complete, but we’ve crossed the 50% point. In fact, last week we migrated another of our larger services, which was running at about $150K/mo in Amazon, into our colo. Not only is it significantly cheaper for us to run now, it’s been much more reliable and faster across the board.

The Moz numbers are 3-year projections and far more than just an anecdote. I’m providing actual evidence from real cost calculations: what is being spent now on AWS vs. what has already been spent to move off AWS, and what the final projected cost will be. As Mark said, the project is nearing completion, so there’s unlikely to be anything unexpected.

You deal with an HVAC failure, or any failure, by having redundancy across multiple DCs. Merely using AWS doesn’t mean you get magical redundancy with no effort – this is perhaps the biggest error I see people make. You have to do the same work to get redundancy on AWS as you do with colo, i.e. multiple regions/DCs; only the way you implement it is different.

There’s always going to be unexpected things. Just look at the failures Netflix has had with AWS – always something unusual. The same will likely happen with colo too – you learn as things happen. But with colo, the spend is so much lower that you can afford to engineer around these things much more easily, at least at scale.

I agree AWS is great for startups and low-volume projects where you have just a few instances or make use of most of the platform services like RDS, queuing, mail, etc. It’s just that when you hit scale, it no longer makes sense.

We have done all of that – redundancy from the frontend to the backend. In the last year we’ve had more availability issues in Amazon than in our own colo. Search for Amazon outages in the last year; some of those were extremely painful for us.

Yes, I agree 100% with Jake, and disagree 100% with David Mytton. I work for AWS. The amount of engineering AWS has done to reduce its cost basis while maintaining high reliability, durability and security is staggering. I wish I could share details of the level of investment and the cool engineering work (such as our own internal SDN and router hardware to scale), but can’t for obvious competitive and IP reasons. Take a look at James Hamilton’s talk at re:Invent for a peek – even that talk does not reveal much due to confidentiality.

Suffice it to say that if you want to achieve the same combination of features, quality and price, you would need to do a heck of a lot of engineering. Yes, anyone can buy a server or storage device (maybe even cheaper in some isolated cases), but that is very, very different from the sophisticated end-to-end cloud infrastructure AWS has built, which buys the customer a ton of benefits. You have to compare apples to apples, and this article is far from that. And even leaving aside the engineering effort that has already gone into delivering the services, AWS has large operations teams supporting the infrastructure, fixing issues and constantly optimizing performance and reliability. Add in multi-AZ and multi-region capabilities, the plethora of nicely integrated services and the exploding third-party marketplace, and there is no comparison.

Combine AWS’s massive engineering investment with the drive towards running low margins and passing on the savings to customers, and you have unbeatable prices. There are already many case studies where customers quote savings when moving from on-prem to AWS. Check out the website.

A more apples-to-apples comparison would have been with Azure or GCE. I don’t even know why I bothered responding to such a patently flawed article. No more responses from me on this; it’s not worth dignifying further.

It’s true that AWS is leading in innovation – I don’t think any other cloud provider is anywhere close, and the speed of releases is impressive. No doubt the level of engineering required to maintain the infrastructure is very high, and it shows in the kind of features that are available. So if you want to buy into the entire AWS stack, then it’s possibly worth it, with features like RDS (so you never need to manage your database), DynamoDB, mail, DNS, etc.

But this article isn’t about any of those things – it’s about pure compute and networking, and Moz runs their own storage. For these things, AWS is not better, and it’s not cheaper. The level of engineering behind the scenes is irrelevant because these are commodity services, and they are the core of what most people need.

When you get to a certain level you don’t want to be using a black-box platform. You want control and customisability, and then all that really matters is the basic compute, networking and storage.

This article uses Moz as an example and it just so happens they were previously (and still are for some workloads) on AWS. The earlier article actually looked at Rackspace and Softlayer (who were also mentioned in this article) so the comparison is valid.

It’s a shame you don’t want to engage in a debate on this. It seems a common position among those who push the cloud dogmatically, as if there’s nothing better and you shouldn’t consider anything else. It’s like a religion – ignore the critics even when there is intelligent debate to be had. In actual fact, there is a direct comparison to be made for the basic core services and, as my analysis shows – backed up by evidence from Moz – colo is cheaper. The supporting services are much more difficult to compare, which is why I didn’t.

I’m not anti-cloud or anti-AWS – I just believe it’s being touted as the one true solution when, if you look closely, that’s not actually true. I’d like to see more people discussing this.

I would accept your numbers in the blink of an eye, without any argument, if any other company were involved. But once it’s Amazon, the whole calculation changes, because Amazon is known for cutting costs to the bone and passing the savings along to the customer.

You are saying the savings come purely from buying the hardware outright and everything else stays the same – that personnel costs are basically equal because you need people to manage the cloud anyway. That’s not exactly the case. With cloud you are only dealing with software, and the tools are a little more refined. With colo, hardware maintenance is also your own responsibility, and that is not exactly magic. There is a fine line here, and crossing it easily means more personnel: say, for every two people dealing with software you need one more who is hardware-oriented – that’s 50% extra right there. Your calculations are almost ideal, but in real life, for critical services, a certain cushion is required.

As for what you’re right about: cloud providers scale CPU and RAM up or down together, when the needs are often asymmetrical – maybe they should work on that. And as for the savings from owning hardware outright, cloud providers could, in the true sense of a commodity service, let you buy the virtual CPU, RAM and storage outright and even spread the cost over monthly payments, like T-Mobile phone plans these days.
So no, 70% is definitely not there. More like 20%, and even that depends heavily on the application.

As regards people, you need an ops team regardless of whether you’re on the cloud or on your own hardware. These are the kind of people who likely have experience with both areas and, if not, can easily switch and learn what they need. This is exactly what happened at Moz – they didn’t need to hire anyone new to work on their own equipment.

The basic hardware comparison between EC2 and Dell may be idealized, but the real-world example I gave from Moz goes against what you’re saying. Even if you think the calculation is a little off (it’s not), the saving is still so large that any extra cost is easily covered.

The 70% is definitely there and is demonstrated by the examples. Feel free to comment with your own calculations showing where you got the 20% figure from.

That 20% was not an exact number – I was just using it to say that savings might be there, but not as big a number as 70%.

However, here’s what I think the cloud is best suited for. The cloud definitely offers savings in certain situations: you should have capacity in house to serve average computational demand, and whenever needs exceed that average, the work should be sent to the cloud on demand. That (cloud on demand) is where the savings show up.

The figures in the article come to around 70%, both in the more hypothetical single-server comparison and in the real-world example from Moz. The 70% number isn’t just hyperbole.
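As a sanity check, the savings curve can be reproduced from the single-server figures quoted in the article. The exact percentage depends on which provider you compare against: against the SoftLayer dedicated server it lands in the mid-60s over three years, and against the pricier Rackspace and Amazon instances from the earlier analysis it approaches 70%.

```python
# Savings vs. years of ownership, using the article's figures.
hardware = 4000      # one-time Dell purchase (article figure)
colo_yearly = 1932   # power + hosting (article figure)
cloud_yearly = 9468  # SoftLayer dedicated server (earlier analysis)

for years in (1, 2, 3):
    colo = hardware + colo_yearly * years
    cloud = cloud_yearly * years
    print(f"{years}yr: colo ${colo:,} vs cloud ${cloud:,} "
          f"-> {1 - colo / cloud:.0%} saved")
```

This prints 37%, 58% and 66% savings for years one through three: the gap widens every year because the hardware cost is paid only once.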

But agreed, the cloud is cheaper in some situations, i.e. elastic workloads like batch processing or handling traffic spikes. What I’m trying to show in the article is that it’s not cheaper for long-running, consistent workloads, which all apps have. Moz is a great example of this, running their main workload on colo and using AWS for elastic compute.
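That split can be framed as a break-even utilization. The sketch below uses the article’s colo costs amortized over three years; the $0.50/hour on-demand price for a comparable instance is purely an assumption, not a real quote.

```python
# Break-even utilization sketch; the $0.50/hr on-demand price
# is an assumed figure, not a real cloud quote.
HOURS_PER_YEAR = 24 * 365

owned_yearly = 1932 + 4000 / 3  # colo hosting + hardware amortized over 3 yrs
cloud_hourly = 0.50             # assumed comparable on-demand instance

# Fraction of the year the workload must run for owning to beat
# paying on demand only for the hours actually used.
break_even = owned_yearly / (cloud_hourly * HOURS_PER_YEAR)
print(f"break-even utilization: {break_even:.0%}")  # ~75%
```

Below that utilization (spiky batch jobs, traffic bursts), on-demand is cheaper; a steadily loaded database sits well above it, which is consistent with the Moz split between colo and elastic AWS compute.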

In my experience, your estimate of dedicated hardware time is much too high. Yes, it requires more expertise, but past a certain size, companies should have that anyway. Hardware is only a concern when it’s purchased, installed or fails, and those occurrences are rare.