Cloudera Expands Hadoop Management for the Enterprise

The Apache Hadoop project has generated a lot of hype as the poster child for the phenomenon known as Big Data. The practical reality, though, is that Hadoop works best as part of a distribution of complementary tools and applications that together enable an effective Big Data deployment.

One of the leading vendors in the commercial Hadoop space is Cloudera, which develops the Cloudera Distribution for Hadoop (CDH). Cloudera Enterprise goes a step beyond a basic Big Data deployment, combining subscription-based management software with support services for users of CDH. The new Cloudera Enterprise 3.7 release comes as vendors big and small jump on the Hadoop bandwagon and as interest and investment continue to grow.

"Cloudera Manager is now in its third major release so we've had nearly two years of research and development in this product," Charles Zedlewski, VP of Product at Cloudera, told InternetNews.com. "It can manage the complete Hadoop stack so it covers everything from HBase to Zookeeper to MapReduce and it manages the full operational lifecycle."

Zedlewski added that the new release includes a revamped look and feel with an updated user interface. A key part of the new interface is something Cloudera calls "Universal Time Control."

"We bring all the information into a single context, and then we give you one means of navigating through all of it, which is the notion of Universal Time Control," Zedlewski said. "You can see all the different Hadoop events and services together in one timeline."

The release also includes intelligent log management, which automatically scans the logs created by the various Hadoop systems and proactively looks for logs and events that might require an administrator's attention. It can also provide alerts on system events.

With the Cloudera Enterprise 3.7 release, Cloudera Manager is now available in a free edition as well as a paid one. The Cloudera Manager Free Edition replaces the company's SCM Express management solution.

"The free edition is designed to make it easier for people to get started with Hadoop," Zedlewski said. "You just download a single file, point it at an IP range, and it will build a complete production-grade Hadoop cluster in 30 minutes."

Zedlewski noted that the whole goal of the Free Edition is to help expand the base of Hadoop users. There are a number of key differences between the free and paid versions of Cloudera Manager. Among them is the fact that the free version is not able to automate Hadoop security features. The free version also lacks service monitoring and log search capabilities.

"There are definitely premium features that we have in the paid edition as well as the fact that there is no node limit," Zedlewski said.

While the Cloudera Enterprise management solution helps make a Hadoop deployment possible and manageable, there are some things it does not do. Enterprise IT users accustomed to the cloud have come to expect elastic scalability of services, but Zedlewski noted that Hadoop compute loads typically aren't burstable.

"Compute is pretty easy to move around," Zedlewski said. "A Hadoop node is typically carrying anywhere from 1 to 10 Terabytes of data, so I can easily make a new node appear, but it takes time to replicate [a] terabyte of data."

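To put rough numbers on Zedlewski's point about replication time, the back-of-the-envelope Python sketch below estimates how long it might take to copy a node's worth of data to a new machine. The 1 Gbps link speed and the 70 percent efficiency factor are assumptions chosen purely for illustration; they are not figures from Cloudera, and the function name is hypothetical.

```python
def replication_hours(data_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate hours needed to copy `data_tb` terabytes over a network link.

    Assumptions (illustrative only): decimal units (1 TB = 1e12 bytes),
    a single link of `link_gbps` gigabits per second, and an `efficiency`
    factor for the fraction of raw bandwidth actually usable.
    """
    total_bits = data_tb * 1e12 * 8                  # terabytes -> bits
    effective_bps = link_gbps * 1e9 * efficiency     # usable bits per second
    return total_bits / effective_bps / 3600         # seconds -> hours


if __name__ == "__main__":
    # The article's range: a node carrying 1 to 10 TB of data.
    for tb in (1, 10):
        hours = replication_hours(tb, link_gbps=1.0)
        print(f"{tb} TB over an assumed 1 Gbps link: about {hours:.1f} hours")
```

Under these assumptions, a new node can be provisioned in minutes but may need hours before it holds a useful share of the data, which is why Hadoop capacity does not burst the way stateless compute does.
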
Aside from elasticity, Zedlewski noted that there are still a whole host of problems left to solve in the Big Data space. On the management side, the big issues are about dealing with the complexity of the Hadoop stack as more projects are continuously added. "We see Cloudera Manager as a way of helping companies to deal with the downside of complexity that comes from innovation," Zedlewski said. "There is also a whole lifecycle of updates and upgrades which needs to be much improved."