My mission: Find technology for Early Adopters. Follow me on Twitter @danwoodsearly, on LinkedIn at www.linkedin.com/in/danwoodsearly/, and on my blog at http://www.CITOResearch.com. I am a CTO, writer, and consultant. For tech vendors, I help explain their technology. For users, I help find, select, and deploy new solutions that have explosive business value. I love to speak and share ideas.

Explaining Hadoop to Your CEO

When it comes to making a large investment to adopt a new technology like Hadoop, eventually the CEO must be presented with the pros and cons and approve the expense. Most CEOs start with a skeptical attitude for good reasons. They know that their technology staff really can’t wait to play with the latest stuff. They know that new investments in technology go south as often as they work out. The bloom is off the rose for investments in technology for its own sake, especially in a weak economy.

But most CEOs also know that other CEOs are making investments and aggressively trying to get an edge. Nick Carr’s advice that “IT Doesn’t Matter” hasn’t really taken hold. CEOs know that technology matters, but only if it is implemented properly to support and improve core activities of a business. Smart CEOs also know that most powerful technologies require some experimentation before anyone can figure out what the technology will mean to a business. Rookie CEOs and control freaks will ask for the charade of a hard ROI model, something that is almost impossible to create with any degree of accuracy. And if these ROI models are so important, why are they never revisited after the technology is implemented? I’ll tell you why. These models were BS in the first place, and so much is learned in the implementation that the upfront ROI models are almost always irrelevant. Also, when CEOs want to invest in something on a hunch, they don’t do much of an ROI model. They just call it strategic.

A seasoned CEO will look for a plausible path to significant value, along with the outlines of a plan for broad adoption if all goes well. The path to value should describe the initial questions that will be answered by the technology, the processes that will be improved, and the decisions that will be made better with more information. The suspected business impact should be defined, but it should be clear to everyone that the unexpected impact may well be even bigger. The amount of money to be spent should be presented with a plan for the first few experiments that will be performed. With cloud computing infrastructure, none of this should mean a huge upfront cost.

The rising tide of new machine data sources that goes by the name of Big Data is now driving a lot of companies to ask: How can we make use of this data? This is where the conversation about Hadoop usually starts.

Hadoop was born because existing approaches were inadequate to process huge amounts of data. The itch Hadoop was created to scratch was the challenge of indexing the entire World Wide Web every day. Google described a paradigm called MapReduce in a paper published in 2004. Hadoop grew out of the open source Nutch search engine project as an implementation of MapReduce starting in 2005, and Yahoo! became its biggest backer after hiring Hadoop co-creator Doug Cutting in 2006.

Philip Wickline, CTO of Hadapt, a product for big data analytics built on Hadoop, points out that it is a mistake to think of Hadoop just as a MapReduce implementation. “To be sure, it started out this way,” said Wickline. “But Hadoop has transformed into a massive operating system for distributed parallel processing of huge amounts of data. MapReduce was the first way to use this operating system, but it will be joined by many other techniques.” The Apache Hive and Pig open source projects and the MapR product are all efforts to make Hadoop easier to use for particular purposes.

Much like any other operating system, Hadoop has the basic constructs needed to perform computing: It has a file system, a way to write programs, a way of managing the distribution of those programs over a distributed cluster and a way of accepting the results of those programs, ultimately combining them back into one result set. Wickline said that the next version of Hadoop will have more ways of writing programs beyond MapReduce, such as MPI.
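For readers who want to see the shape of the idea, here is a minimal sketch of the MapReduce pattern in plain Python. It is not Hadoop's API, and it runs in a single process rather than across a cluster. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are invented for illustration, and each string in the input list stands in for a chunk of a file that Hadoop would store on a different machine.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Hypothetical driver: on a real cluster the map calls would run in
    # parallel on the machines that hold each chunk of data.
    mapped = []
    for document in documents:
        mapped.extend(map_phase(document))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    chunks = ["big data big value", "data drives value"]
    print(word_count(chunks))
```

The point of the pattern is that the map step looks at each chunk independently and the reduce step looks at each key independently, so both can be spread over as many machines as the data requires. The work of distributing programs and combining results described above is what sits around these two steps.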

Is Hadoop the Path to Value from Big Data?

So hooray, Hadoop can use a cool programming paradigm created by Google to crunch through huge amounts of data. The question is: Should a CEO be interested in this? In several industries that use Hadoop to process big data, the answer is “yes.” If you’re in financial services, the answer is “yes.” If you’re in defense or intelligence, the answer is “yes.” If you are in one of Silicon Valley’s elite search engine or advertising companies, the answer is “yes.” If you are in the manufacturing business, the answer will be “yes” pretty soon.

Comments

Dan – my company Think Big Analytics provides Big Data strategy and execution for the enterprise. Your points are spot on from our experience working with over 30 leadership teams to help them understand and incorporate Hadoop into their strategic initiatives. We see companies integrating Hadoop with other big data technologies, such as MPP databases and NoSQL edge serving. Clients often start using Hadoop to provide massive cost savings to the IT department in traditional data management, but the CEO notices when there’s a wave of innovation based on the predictive analytics solutions that follow.

Great article Dan. It is also worth mentioning the HPCC Systems platform as an alternative to Hadoop. Unlike Hadoop distributions, HPCC is a mature platform and provides for a data delivery engine together with a data transformation and linking system equivalent to Hadoop. The main advantages over other alternatives are the real-time delivery of data queries and the extremely powerful ECL language programming model. More information at http://hpccsystems.com