How To Explain Hadoop To Non-Geeks

With Hadoop 2.0's arrival, you will need to explain the benefits of the reigning big data platform to business types and C-suite executives.

Big data is a popular topic these days, not only in the tech media, but also among mainstream news outlets. And October's official release of big data software framework Hadoop 2.0 is generating even more media buzz.

But while you, InformationWeek reader, clearly understand Hadoop's significance, there's a high probability that many people in your organization -- including more than a few managerial types in the C-suite -- aren't really sure what Hadoop is, what it does, or why it's important.

So, how do you explain Hadoop to non-geeks? One approach is to focus on the benefits of Hadoop and big data, rather than providing mind-numbing details (with forgettable acronyms) on how it all works.

Forrester analyst Mike Gualtieri took this "benefits" approach in June when he posted a brief tutorial video that provided an easy-to-grasp overview of Hadoop. He calls it a platform that makes big data easier to manage.

"To understand Hadoop, you have to understand two fundamental things about it," Gualtieri explained in his video. They are: How Hadoop stores files, and how it processes data.

He added: "Imagine you had a file that was larger than your PC's capacity. You could not store that file, right? Hadoop lets you store files bigger than what can be stored on one particular node or server. So you can store very, very large files. It also lets you store many, many files."
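For readers who do want a peek under the hood, the storage idea Gualtieri describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Hadoop's actual file system API: the node names, the tiny block size, and the helper functions `split_into_blocks`, `place_blocks`, and `reassemble` are all hypothetical. It only shows how a file too big for one machine can be cut into blocks, spread across several servers, and put back together.

```python
# Toy illustration only; this is NOT the real Hadoop storage API.
BLOCK_SIZE = 4  # bytes per block (hypothetical; real clusters use far larger blocks)
NODES = ["node-a", "node-b", "node-c"]  # hypothetical server names


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut the file contents into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks, nodes=NODES):
    """Spread the blocks across the nodes in round-robin order."""
    placement = {node: [] for node in nodes}
    for index, block in enumerate(blocks):
        node = nodes[index % len(nodes)]
        placement[node].append((index, block))
    return placement


def reassemble(placement):
    """Collect every block from every node and join them in original order."""
    indexed = [pair for blocks in placement.values() for pair in blocks]
    return b"".join(block for _, block in sorted(indexed))


file_contents = b"a file too big for one machine"
placement = place_blocks(split_into_blocks(file_contents))

# No single node holds the whole file, yet the file can still be rebuilt.
assert reassemble(placement) == file_contents
```

A real cluster also keeps extra copies of each block on different servers so that losing one machine does not lose the file; that detail is left out of the sketch.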

By focusing less on the jargon of Hadoop and big data, and more on the platform's real-world benefits, experts can effectively convey its value to business colleagues who do not have data-science backgrounds.

Mike Gualtieri explains Hadoop in a video posted on the Forrester blog.

"Mainstream business users don't need to know how Hadoop works," Gualtieri told InformationWeek via email. "But they do need to understand that the constraints they once had on storing and processing data are removed when Hadoop is installed."

As a result, "the business can start thinking big again when it comes to data," he added.

The barrage of news reports on all facets of big data, including its potential to fight various diseases, reduce government bureaucracy, locate terrorists, and on a more mundane level, help businesses sell more stuff, has helped introduce business people to Hadoop, even though a lot more education is needed.

"There is less confusion than there was 12 months ago," Gualtieri said. "Executives just know that it is a big data technology, and that is enough for them."

OK, so what's this "MapReduce" thing then? It's part of Hadoop too, right? As Gualtieri explained in his video: "The second characteristic of Hadoop is its ability to process that data, or at least (provide) a framework for processing that data. That's called MapReduce."

But rather than take the conventional step of moving data over a network to be processed by software, MapReduce uses a smarter approach tailor-made for big data sets.

Moving data over a network "can be very, very slow, especially for really large data sets," Gualtieri added in the video. "Imagine if you're opening a really, really big file on your laptop, it takes a long, long time. It takes much longer than if it's a short, tiny file."

So rather than move the data to the software, MapReduce moves the processing software to the data. Hadoop is still very complex to use, but many startups and established companies are creating tools to change that, a promising trend that should help remove much of the mystery and complexity that shrouds Hadoop today.
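The "move the processing to the data" idea can be sketched loosely in Python. This is a minimal illustration in the spirit of MapReduce, not Hadoop's API: the `cluster` dictionary, its node names, and the functions `map_on_node` and `reduce_counts` are assumptions made up for the example. Each node counts words in the lines it already holds, and only the small per-node summaries are combined at the end.

```python
from collections import Counter

# Hypothetical cluster: each node already holds its own slice of the data.
cluster = {
    "node-a": ["big data is big", "hadoop stores data"],
    "node-b": ["hadoop processes data"],
    "node-c": ["data is everywhere"],
}


def map_on_node(lines):
    """Runs where the data lives: count the words in this node's local lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def reduce_counts(partials):
    """Combine the small per-node summaries into one final result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Only the compact per-node counts travel over the "network", not the raw lines.
partial_counts = [map_on_node(lines) for lines in cluster.values()]
word_counts = reduce_counts(partial_counts)

print(word_counts["data"])  # prints 4
```

The design point is that the raw text never leaves the node that stores it; only the much smaller word counts are sent along to be merged.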

"Hadoop innovation is happening incredibly fast," said Gualtieri via email. "The open source community and commercial vendors are working like gangbusters to make SQL access super-fast on Hadoop. That will open up connections from many other tools like Tableau, and other BI tools that interface to data using SQL."

But don't use that last paragraph to explain Hadoop to novices, please.

Maybe it's a techie approximation of a stereotypical CEO. Like many techies I know, they all have allergies when it comes to business users and business management, as if they are marginally a necessary evil.

I've been seeing a lot of articles come through my inbox about Hadoop, and being unaware of what it is, I found that the video hit the points that gave me a better understanding of it. The visual aid certainly helps when you can actually hear and see the person speaking; if that were on paper I would still be lost. Like any presentation, the message needs to be tailored for the audience.

Fully agree with your point - for non-geeks we need to use layman's terms, and there is no doubt about it. But we need to be careful about how we present the topic - some people prefer a presentation while others would rather watch a video. All in all, for non-geeks and business-oriented people, we need to deliver the message that big data can help business growth, using real-life examples. Hard-selling technical stuff will not make your CEO buy in. :-)

Forrester's Mike Gualtieri touches on a key point: the data sorting software is moved to the data locations on a cluster. What's also important is that Hadoop can expand to handle any amount of data and bring results on a big set as fast as a small one, thanks to its parallel processing. Just add servers to the cluster. Oops, the CEO just yawned. Quick, tell him and the CFO that it's much cheaper to use Hadoop than it is to use Oracle or DB2.

I wouldn't conflate keeping a presentation concise, free of jargon, and very focused on the business benefit of a given tech (as opposed to how cool it is) with assuming that a CEO or CFO doesn't understand basic database concepts.

That's as may be, Lorna, but we've heard from many analytics experts that if you can't boil down a project pitch to 10 minutes or less, it doesn't even pay to approach the execs with it. Have you found things to be different in your experience?

I don't disagree with using plain English. But what line-of-business colleagues really want to hear is the language that relates to customer experience. I interviewed a senior IT leader earlier this year who told me she had a solid technology case for ripping out an important system, but got kicked out of the CEO's office multiple times -- until she made a short video that showed how the system related to what the customer experienced. Her CEO didn't just need plain English, he needed a visual. Adjust as needed for your culture.

The biggest challenge in this space will come down to keeping those explanations up to date. You may have just found the right balance between 'Explain it to me like I'm 5' and 'business use case' for your team when, bam, suddenly Hadoop goes 2.0 and the parameters they have understood have changed.

Not that big a deal - until you examine how fast that ecosystem is evolving. It's not just a space dominated by Hadoop.