Cloud or On Premise for Data Analytics: How to Choose

January 9, 2019


GitLab, Dropbox and WhatsApp. All of them started on the cloud, and all of them moved back to private cloud or on premise infrastructure. Why? We sent our content team to find out. Two weeks (and several gallons of coffee) later, here are the results of their research.

These 4 factors are driving companies out of the cloud:

Cost savings

Data control

Power

Performance

In this article, we’ll briefly cover each factor and the implications for your company. As you read, bear in mind that we approach this from the perspective of natural language processing (NLP), since that’s our focus. However, these lessons are applicable to all forms of data analytics.

Factor 1: Cost savings

According to our research, the most common incentive for companies to get out of the cloud is money. For example, Moz saved $4.35 million by moving out of the cloud. Data warehouse MemSQL cut their three-year server spend by 80%. And Dropbox, the popular file-hosting service, reduced their operational expenses by $74.6 million over two years.

How? For most companies, it’s a matter of scale. Think about the cost factors involved in any data analytics project:

Cloud computing gained popularity based on claims of cost savings and convenience. But in reality, cloud services become far more expensive as companies thrive. As CNBC points out, “Relying on a third-party cloud provider is common for young companies, which have to choose their priorities. But cloud costs can add up as companies grow.”

As an aside, don’t be afraid to make the switch. All of the companies above began on the public cloud. Just be sure to mitigate risk by choosing an experienced private cloud or on premise provider.

We suggest that any company faced with a large cloud services bill calculate its break-even point between cloud and on premise costs. For example, we’ve calculated that on our platform, break-even comes at around 12 million documents per month. For more info, see the full report.
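As a rough illustration of a break-even calculation, here is a minimal Python sketch. It assumes a simple cost model that is not taken from the report: the cloud charges a flat price per document, while on premise carries a fixed monthly cost plus a smaller per-document cost. The function name and every dollar figure below are hypothetical placeholders, so substitute your own numbers.

```python
from typing import Optional


def break_even_docs_per_month(
    cloud_price_per_doc: float,
    onprem_fixed_monthly: float,
    onprem_cost_per_doc: float,
) -> Optional[float]:
    """Return the monthly document volume at which both options cost the same.

    Assumed model (hypothetical, for illustration only):
      cloud cost   = cloud_price_per_doc * volume
      on premise   = onprem_fixed_monthly + onprem_cost_per_doc * volume
    Returns None when on premise never becomes cheaper.
    """
    saving_per_doc = cloud_price_per_doc - onprem_cost_per_doc
    if saving_per_doc <= 0:
        # The cloud is at least as cheap per document, so the fixed
        # on premise cost is never recovered.
        return None
    return onprem_fixed_monthly / saving_per_doc


if __name__ == "__main__":
    # Placeholder prices, not real vendor pricing.
    volume = break_even_docs_per_month(
        cloud_price_per_doc=0.002,
        onprem_fixed_monthly=10_000.0,
        onprem_cost_per_doc=0.0004,
    )
    if volume is None:
        print("On premise never breaks even under these assumptions.")
    else:
        print(f"Break-even at roughly {volume:,.0f} documents per month")
```

Below the returned volume, the cloud is cheaper under this model; above it, on premise is. A real comparison would also account for staffing, hardware refresh cycles, and migration costs.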

Factor 2: Data control

Alright, say it with me: money isn’t everything. Or rather, up-front cost isn’t everything. Because everyone should be worried about data control. Hackers have hit industries ranging from entertainment to hospitality to the Pentagon and beyond. (Is the Pentagon an industry? That’s a topic for another day and a different blog.)

The security risks of public clouds are well-documented. Indeed, our research into the state of cloud cybersecurity suggests that any company handling sensitive data should get out of the public cloud. But where should you go instead?

If you’re not ready to leave the cloud entirely, you’re left with two choices: private cloud or hybrid cloud.

In private clouds (roughly synonymous with on premise), data services and infrastructure are dedicated to your organization. Hybrid clouds combine on premise infrastructure with public cloud services. We found that the hybrid approach is rapidly gaining popularity, particularly with small- to mid-sized analytics companies and enterprises that handle sensitive data. (TapJoy is just one example.)

Factor 3: Power

Cloud-based data analytics providers promise to make things easier. This can be great for smaller, less-technical teams. But in order to serve large numbers of users with the most convenience possible, cloud providers tend to optimize for simplicity.

When you outsource any technology, you outsource ownership of its features. Cloud providers want to deliver a solution that gets you up and running as quickly as possible, with the minimum analytics features you require. But when your product or project is built on this kind of technology, what competitive differentiation can you show?

Here’s what Meg Whitman, CEO at Hewlett Packard Enterprise, had to say when asked why many enterprises are deciding the public cloud is no longer the best option for them:

“They want to scale to a hybrid environment that is developer-friendly and gives their business more control and better total cost of ownership.” – Meg Whitman, CEO, HPE

Of course, most cloud analytics solutions offer some level of differentiation for each user. But the scope of customization in the cloud just can’t compete with the deep access offered by on premise solutions. With on premise, you gain full ownership of the analytics performed, and can fit the results exactly to each end-user.

Not every company needs the power of an on premise solution. Indeed, not every team has the technical ability to utilize that level of access. But many companies see huge competitive benefits and lower customer churn after moving to on premise technology.

Factor 4: Performance

The fourth factor driving companies out of the cloud is performance, as measured by latency and bandwidth.

In data analytics, latency is the total time it takes to send, process, and receive the results for a single datum (such as an individual text document). Bandwidth measures the volume of data you can process within a period (for example, number of tweets per second).
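To make the two metrics concrete, here is a minimal Python sketch that times a batch of documents. The `analyze` function is a hypothetical stand-in (it only counts whitespace-separated tokens) and `measure` is an illustrative helper, not part of any real analytics API. For a cloud service, the function you pass in would wrap the full request round trip, so the measured latency would include send and receive time. Note that what this article calls bandwidth is often called throughput.

```python
import time
from statistics import mean
from typing import Callable, Dict, Sequence


def analyze(document: str) -> int:
    """Hypothetical stand-in for an analytics call: counts tokens."""
    return len(document.split())


def measure(
    documents: Sequence[str],
    analyze_fn: Callable[[str], object],
) -> Dict[str, float]:
    """Time each call (latency) and the whole batch (documents per second)."""
    if not documents:
        raise ValueError("need at least one document to measure")

    latencies = []
    batch_start = time.perf_counter()
    for doc in documents:
        call_start = time.perf_counter()
        analyze_fn(doc)
        latencies.append(time.perf_counter() - call_start)
    elapsed = time.perf_counter() - batch_start

    return {
        "mean_latency_s": mean(latencies),
        "max_latency_s": max(latencies),
        # Guard against a zero reading on very coarse clocks.
        "docs_per_second": len(documents) / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    sample = ["the quick brown fox jumps over the lazy dog"] * 1000
    stats = measure(sample, analyze)
    for name, value in stats.items():
        print(f"{name}: {value:.6f}")
```

Running the same harness against a cloud endpoint and a local deployment, with the same documents, gives a like-for-like comparison of the two metrics.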

Performance can be a key competitive differentiator for data analytics companies. But even the fastest cloud analytics services have some delay between submission, analysis, and reporting. In the end, no cloud solution can beat the performance of an on premise setup.

There are many complex technical details involved in these performance metrics. Much like with Factor 3: Power, we suggest you consult your engineering and product teams to determine whether performance should impact your “cloud or on premise” decision.

Cloud or on premise? How to decide

We hope this article has shed some light on why many companies are getting out of the cloud in favor of on premise infrastructure. But what should your company do? Start with this simple flowchart:

Next, read through the complete report. The full version covers each factor in more detail, and includes a number of guidelines to follow.

If you’d prefer to consult with a human, contact us to discuss your data analytics needs. Whether you’re a small business, high-volume data analytics company, or a data analyst team at a large enterprise, we’ll be happy to advise you.