Hortonworks, formed with the venture capital firm Benchmark Capital, will focus on advancing the open source software’s current framework. Hadoop has grown increasingly popular because of its cloud computing technology and its ability to process and store large amounts of data coming from semi-structured and unstructured as well as structured sources. The new company will also offer support for businesses and vendors alike. Yahoo and Benchmark Capital have not disclosed how much each company invested in the new business.

Did you know?

Most are familiar with Dr. Seuss’ Horton Hears a Who, starring an elephant who swears to protect the diminutive people of Whoville while proclaiming, “A person is a person no matter how small.” Horton’s name is now featured in the Hadoop-focused company Yahoo and Benchmark Capital recently created called Hortonworks.

But this isn’t the first time a fictional elephant has lent its name to the Apache Hadoop project. Doug Cutting, whose technology is part of the groundwork for the Apache Software Foundation, said his inspiration for the word Hadoop came from the name of his kid’s stuffed yellow pachyderm.

A formal announcement is expected to come later today at the Hadoop Summit, being held in Santa Clara, Calif. One of the summit presenters, Eric Baldeschwieler, has already been named the chief executive officer of Hortonworks, according to a release that surfaced yesterday. Baldeschwieler, who is slated to talk about the past and the future of Hadoop at the summit, was formerly the vice president of software engineering for the Hadoop team at Yahoo. He will be joined at Hortonworks by about 25 others, including several Yahoo employees and core Hadoop architects and developers.

"We anticipate that within five years, more than half the world's data will be stored in Apache Hadoop,” Baldeschwieler said in a press release. “We've assembled a top-caliber team committed to the Apache open source community and with the technology and business expertise to deliver value to the big data market.”

Although an open source community of engineers and architects has had a hand in developing Hadoop, Yahoo has been its true pioneer and the largest contributor to the project, developing about 70% of the code. In return, Hadoop has been instrumental in managing Yahoo’s voluminous data, which runs on 42,000 servers and delivers content to nearly 700 million customers worldwide, according to the Hortonworks website.

“Apache Hadoop has been and will continue to be an important area of investment for Yahoo,” Jay Rossiter, senior vice president of Yahoo’s cloud platform group, said in a press release. “The creation of Hortonworks will enable Yahoo to leverage a commercial partnership in addition to our continued internal investment to accelerate the evolution of the technology and its use to power Yahoo’s business.”

Is it time for another look at Hadoop?

Yahoo's decision to spin off Hortonworks will add fuel to the emerging market for enterprise-grade Hadoop distributions. However, potential customers of the new company need to proceed cautiously, according to one industry analyst.

With Hortonworks, Yahoo has effectively become one of the biggest names in the Hadoop business, said James Kobielus, a senior data management analyst with Cambridge, Mass.-based Forrester Research Inc. But Yahoo also lacks the experience of a veteran commercial software vendor.

"Yahoo has been a Web 2.0 pure play from the start and now they're getting into the products business," Kobielus said. "Can Yahoo manage an actual software product group? They're unproven, so that remains to be seen."

End users can expect the market for enterprise-grade Hadoop to heat up in the coming months as new vendors like Hortonworks enter the market and established vendors like Cloudera, DataStax and MapR continue to develop and launch new products, Kobielus added.

What is ‘big data’?

“Big data” (also spelled Big Data) is a general term used to describe the voluminous amount of unstructured and semi-structured data a company creates -- data that would take too much time and cost too much money to load into a relational database for analysis. Although big data doesn't refer to any specific quantity, the term is often used when speaking about petabytes and exabytes of data.

In fact, just today Cloudera rolled out new tools for Cloudera Enterprise 3.5, which beefs up its data management offerings, and Cloudera SCM, which the company claims provides easy installation and configuration for a complete Hadoop-based stack. MapR also announced new partnerships with several companies to help leverage big data analytics.

"We're going to see a glut of these kinds of vendors until there is the inevitable shakeout," Kobielus said. "For end users, this means that they need to take a renewed look at Hadoop for addressing problems that [are usually handled by] data warehouse and analytics vendors like Teradata and Oracle and IBM."
