Section Study Resources

CDH4 update including MapReduce v2 (MRv2)

We offer a great section on YARN in the following video: What’s New in CDH4? A Guide for Previous Attendees of Cloudera Administrator Training for Apache Hadoop

7. Data processing (6%)

Objectives

Analyze and determine the relationship of input keys to output keys in terms of both type and number, the sorting of keys, and the sorting of values.

Given sample input data, identify the number, type, and value of emitted keys and values from the Mappers as well as the emitted data from each Reducer and the number and contents of the output file(s).
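
To practice tracing a job by hand, the objectives above can be illustrated with a small simulation. The sketch below is plain Python, not the Hadoop API: the function names (mapper, reducer, partition, run_job), the word-count logic, the sample input, and the byte-sum partitioner are all assumptions made for illustration. It only mimics the usual flow, in which Mappers emit key/value pairs, each key is routed to exactly one Reducer, keys reach each Reducer in sorted order, and every Reducer writes its own output file.

```python
from collections import defaultdict


def mapper(offset, line):
    """Input key: byte offset (ignored). Input value: one line of text.
    Emits one (word, 1) pair per word, so output keys differ from input
    keys in both type and number."""
    for word in line.split():
        yield word.lower(), 1


def reducer(word, counts):
    """Receives one key with all of its values and emits a single total."""
    yield word, sum(counts)


def partition(key, num_reducers):
    """Hypothetical deterministic partitioner (sum of UTF-8 bytes).
    Stands in for a hash partitioner: equal keys always land on the same
    Reducer."""
    return sum(key.encode("utf-8")) % num_reducers


def run_job(lines, num_reducers=2):
    # Map phase: feed each line to the mapper with its byte offset as key.
    map_output = []
    offset = 0
    for line in lines:
        map_output.extend(mapper(offset, line))
        offset += len(line) + 1  # +1 for the newline

    # Shuffle: group values by key inside the partition chosen for that key.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in map_output:
        partitions[partition(key, num_reducers)][key].append(value)

    # Sort and reduce: keys are visited in sorted order within each
    # partition. The order of values within a key is not sorted here.
    output_files = {}
    for i, groups in enumerate(partitions):
        records = []
        for key in sorted(groups):
            records.extend(reducer(key, groups[key]))
        # File names imitate the part-r-NNNNN pattern, one file per Reducer.
        output_files["part-r-%05d" % i] = records
    return map_output, output_files


sample = ["the cat sat", "the dog sat"]
map_output, output_files = run_job(sample)

print(len(map_output), "pairs emitted by the Mappers")
for name in sorted(output_files):
    print(name, output_files[name])
```

With this sample input the Mappers emit six pairs in total. Under the assumed partitioner, "cat", "dog", and "sat" go to the first Reducer and "the" goes to the second, so there are two output files, each sorted by key. Changing num_reducers changes the number of output files but not the totals.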

Each project in the Hadoop ecosystem has at least one book devoted to it. The exam scope does not require deep knowledge of programming in Hive, Pig, Sqoop, Cloudera Manager, Flume, etc., but rather an understanding of how those projects contribute to an overall big data ecosystem.