Cloudera Launches New Data Management, Hadoop Versions

Used together, Cloudera Enterprise 4.0 and CDH4 form an end-to-end package that enables enterprises to integrate Hadoop into their existing enterprise data management systems for any business application.

Apache Hadoop software and services provider Cloudera, which ranks among industry leaders in showing enterprises how to use new-generation batch analytics, on June 5 launched version 4.0 of Cloudera Enterprise, its flagship data management platform.

The Palo Alto, Calif.-based company, whose chief architect, Doug Cutting, led the original Hadoop development team at Yahoo several years ago, also released a new version of its own Hadoop distribution.

The new management edition combines the company's Cloudera Manager software with around-the-clock expert technical support to deliver a turnkey-type system for deploying and managing Hadoop analytics in production environments.

Not Normally a 'Turnkey' Deployment

The terms "turnkey" and "Hadoop" are not often included in the same sentence, so this is significant news for IT administrators.

Hadoop by itself is notoriously tricky to deploy and use, not just for a line-of-business employee but even for many experienced IT administrators. But new front ends produced by Cloudera and other vendors have made the popular open-source batch analytics engine much easier and more intuitive to use.

Cloudera's Distribution including Apache Hadoop, version 4 (CDH4), came out June 5, following the completion of a rigorous beta program that combined testing and feedback from its enterprise customers and partner ecosystem with contributions from Cloudera's engineering team and the global Apache open-source community.

Used in tandem, Cloudera Enterprise 4.0 and CDH4 form an end-to-end package that enables enterprises to integrate Hadoop into their existing enterprise data management systems for any business application, Cloudera said. New features in the package include high availability and automation for management of large-scale Hadoop clusters.

CDH4's new advanced, enterprise-grade features, according to Cloudera, include:

High availability: Increased usability for mission-critical use cases and applications with a highly available NameNode that eliminates the only remaining single point of failure in HDFS. Heterogeneous clusters minimize downtime and enable users to run different nodes on different versions of Hadoop.

Improved security: Allows for more sensitive data to be stored in CDH with more granular access control to support multi-tenancy. HBase table and column permissions control which users and groups have access to HBase columns and tables.

Improved extensibility: Helps address a broader range of scenarios through coprocessors that enable more sophisticated applications in real time and open resource management (aka MR2) that allows multiple data processing frameworks to run on the same Hadoop cluster, saving costs on storage.

Other new features from the Hadoop stack: Common compression codec (Snappy), common file format (Apache Avro), REST over HTTP access to HDFS, Web shell (for Apache Pig and Apache HBase), slot-less resource manager, and faster and easier user Web access to Hadoop systems.

Two-year-old Cloudera hasn't been shy about forming partnerships with larger companies; it has agreements in place to handle batch analytics deployments for customers of EMC, Dell and Oracle, to name three.

Chris Preimesberger was named Editor-in-Chief of Features & Analysis at eWEEK in November 2011. Previously he served eWEEK as Senior Writer, covering a range of IT sectors that include data center systems, cloud computing, storage, virtualization, green IT, e-discovery and IT governance. His blog, Storage Station, is considered a go-to information source. Chris won a national Folio Award for magazine writing in November 2011 for a cover story on Salesforce.com and CEO-founder Marc Benioff, and he has served as a judge for the SIIA Codie Awards since 2005. In previous IT journalism, Chris was a founding editor of both IT Manager's Journal and DevX.com and was managing editor of Software Development magazine. His diverse resume also includes: sportswriter for the Los Angeles Daily News, covering NCAA and NBA basketball, television critic for the Palo Alto Times Tribune, and Sports Information Director at Stanford University. He has served as a correspondent for The Associated Press, covering Stanford and NCAA tournament basketball, since 1983. He has covered a number of major events, including the 1984 Democratic National Convention, a Presidential press conference at the White House in 1993, the Emmy Awards (three times), two Rose Bowls, the Fiesta Bowl, several NCAA men's and women's basketball tournaments, a Formula One Grand Prix auto race, a heavyweight boxing championship bout (Ali vs. Spinks, 1978), and the 1985 Super Bowl. A 1975 graduate of Pepperdine University in Malibu, Calif., Chris has won more than a dozen regional and national awards for his work. He and his wife, Rebecca, have four children and reside in Redwood City, Calif.

Follow on Twitter: editingwhiz