Saturday, April 5, 2014

Open Source Driving Innovation of Enterprise Hadoop

In the last seven months we have seen a tremendous level of innovation and maturity in the enterprise Hadoop platform. Hortonworks' HDP 2.0 and HDP 2.1 releases show how much of that innovation is being driven by open source today. This innovation is significantly improving the enterprise capabilities of Hadoop and changing the Hadoop landscape. It is difficult for proprietary releases of Hadoop to compete with the hundreds of thousands of lines of code being written by the Hadoop open source community. Organizations ranging from Microsoft to Yahoo are adding their expertise and knowledge to the open source community. Proprietary and mixed open source/proprietary Hadoop solutions are coming under heavy pressure from open source innovation, and Hadoop distributions that are not 100% open source are beginning to disappear.

With HDP 2.0 and 2.1, a number of game-changing capabilities have been added to Hadoop. These releases add comprehensive capabilities in areas such as scalability, multi-tenancy, performance, security, data lifecycle management, data governance, encryption, interactive query, high availability and fault tolerance. Key additions include:
HDP 2.0:

YARN - a distributed data operating system supporting applications with different runtime characteristics. YARN also adds scalability and improved fault tolerance to Hadoop.

NameNode High Availability.

Hadoop scalability to 10,000+ nodes.

New releases of Hadoop frameworks in key areas such as Hive and HBase.

HDP 2.1:

Interactive query capability in Hadoop. The Stinger project has increased the performance of interactive queries by 100 times through Hive optimization, container optimization, Tez integration and an in-memory cache.