Hadoop is Hard! But Big Data Doesn’t Have To Be

By
Jonathan Buckley

June 25, 2015

When it comes to big data analytics, Hadoop has been heralded as the all-in-one solution for the enterprise. And while the many benefits of Hadoop adoption tend to support all the praise, the reality is that organizations that attempt to manage Hadoop themselves quickly discover that doing so is flat out hard, if not impossible. Unfortunately, this is why so many big data initiatives stall before they ever get off the ground. Still, just because Hadoop is a hassle doesn’t mean that big data has to be. And that’s where a managed Hadoop provider like Qubole comes in.

But don’t take our word for it. Here’s what others are saying about the difficulties of managing Hadoop, and why a managed provider makes sense for most businesses.

Hadoop is Hard

According to Todd Papaioannou, the former Chief Cloud Architect at Yahoo who was quoted in a recent article on Gigaom.com, “Hadoop is hard – let’s make no bones about it.” Papaioannou said he learned that lesson after he and his team were tasked with setting up 45,000 Hadoop servers on the company’s 400,000-node private cloud.

What made Hadoop so “damn hard to use”? “It’s low level infrastructure software,” said Papaioannou, “and most people out there are not used to using low-level infrastructure software.”

A May 2015 article on Fortune.com sheds further light on why Hadoop is so hard to handle. According to a new Gartner survey mentioned in the article, “Just over a quarter (26%) of (284) respondents said they are deploying or experimenting with Hadoop and only 11% said they plan to invest in Hadoop within 12 months.” Among the reasons given for the apparent slowdown in Hadoop adoption, Gartner analysts said that a number of respondents listed Hadoop as a low priority, while others felt that it was “overkill.” It was also concluded that, “…a persistent shortage of Hadoop skills is hindering adoption.”

Another take on the results of the Gartner Hadoop survey was provided in a recent article on ZDNet.com titled “Why Hadoop is Hard, and how to make it easier.” In the article, tech writer Andrew Brust explains that he’s not surprised that Gartner’s findings show that corporate adoption of Hadoop hasn’t kept up with the hype.

“For almost any new technology,” Brust explains, “there’s typically a big differential between what the tech journalists and analysts are implying everybody’s doing with that technology and what…everybody’s doing with that technology.”

With regard to Gartner’s statistic that only 26 percent of respondents said that they are “already deploying, piloting, or experimenting with Hadoop,” Brust says that he thinks 26 percent is a very promising number. As for why, Brust explains that “Hadoop’s legacy is that of a specialist’s tool, not an Enterprise tool.” That being said, Brust believes that “26% penetration is pretty good, and it’s going to get better.” According to Brust, “it’s mature analytics tools for Hadoop, DBMS abstraction layers to Hadoop, and Hadoop-as-a-Service cloud offerings that will make Hadoop actionable for the majority of technology users.”

Hadoop Doesn’t Stand Alone

Other enterprise systems, such as sales force automation, accounting applications, or platforms like the data warehouse, come with ready-made game plans for running in production that include process, technology, and availability of skilled staff. Hadoop, by contrast, does not arrive with a game plan for running a variety of different types of workloads at scale straight out of the box.

In a May 2015 article on the Forbes website, Dan Woods, a contributing Forbes writer, explains that the misconception that Hadoop is a stand-alone big data solution is what causes many companies’ initial big data efforts “to crawl, sputter, or outright fail.”

Citing Netflix as a use case for making Hadoop effective and reliable at enterprise scale through its cloud-based “Genie” solution, Woods suggests that organizations will turn more and more to commercially available cloud-based Hadoop management solutions to help them successfully monetize their big data.

Adoption Troubles

Hadoop hasn’t lived up to all the hype. And in light of the challenges that organizations face in attempting to manage Hadoop, there are signs that Hadoop is having adoption troubles. In a recent article on InfoWorld’s Tech Watch blog, author Matt Asay discusses how “Hadoop demand falls as other big data tech rises.”

To support his claim, Asay references the aforementioned Gartner survey analysis. In the report, Gartner says that 54 percent of its 284 global IT and business leader respondents had no plans to use Hadoop.

Commenting on this statistic, Gartner analyst Merv Adrian said, “With such large incidence of organizations with no plans or already on their Hadoop journey, future demand for Hadoop looks fairly anemic over at least the next 24 months.”

Steven Norton, a writer for the Wall Street Journal, made similar observations about the Gartner study in his May 2015 article titled “Hadoop Corporate Adoption Remains Low: Gartner.” In the article, Norton says that “Implementation and deployment hurdles inside big companies aren’t unheard of, and some CIOs have noted they are taking a cautious approach to Hadoop adoption.” According to Gartner, “Hadoop adoption remains at the early adopter phase, where skills and success are still rare.”

Turn to The Cloud

Clearly Hadoop is hard. But thanks to cloud-based Hadoop managers such as Qubole and others, big data analysis for profit and competitive advantage doesn’t have to be. Joe McKendrick, a contributing Forbes writer who commented on a panel discussion held at the May 2015 Data Summit in New York, tells us that scalability and the ability to pay as you go are among the many benefits of the cloud that are attracting today’s businesses.

Bernard Marr’s Forbes article, “Big Data As-A-Service Is Next Big Thing,” is even more enthusiastic about the prospects of cloud vendors going forward.

Stating that the global big data market is predicted to be worth $88 billion by 2021, Marr says that the forecast value of Big Data-as-a-Service (BDaaS) could be as much as $30 billion.

The experts have weighed in and one thing is certain: Hadoop is hard, but it isn’t going anywhere. Thanks to cloud Hadoop management providers, organizations both large and small can enjoy the benefits of a big data analytics strategy.

Are you wondering if big data in the cloud could be for you? Learn how one digital advertising company uses big data in the cloud in this webinar.