Cloudera Designing and Building Big Data Applications

Xebia's four-day course for designing and building big data applications
prepares you to analyze and solve real-world problems using Apache Hadoop and associated
tools in the enterprise data hub (EDH).
You will work through the entire process of designing and building solutions, including
ingesting data, determining the appropriate file format for storage, processing the stored
data, and presenting the results to the end-user in an easy-to-digest form. Go beyond
MapReduce to use additional elements of the EDH and develop converged applications
that are highly relevant to the business.

Take Your Knowledge to the Next Level and Solve Real-World Problems with Training for Hadoop and the Enterprise Data Hub

This course is best suited to developers, engineers, and architects who want to use Hadoop and related tools to solve real-world problems. Participants should have already attended Cloudera Developer Training for Apache Hadoop or have equivalent practical experience. Good knowledge of Java and basic familiarity with Linux are required. Experience with SQL is helpful.

CCP: Data Engineer Certification

This course is an excellent place to start for people working towards the CCP: Data Engineer certification. Although further study is required before passing the exam (we recommend Developer Training for Spark and Hadoop II: Advanced Techniques), this course covers many of the subjects tested in the CCP: Data Engineer exam.

Learn more about the CCP Certification Exam here: http://www.cloudera.com/content/www/en-us/training/certification/ccp-data-engineer.html

Agenda

Introduction

Application Architecture

Defining and Using Data Sets

Using the Kite SDK Data Module

Importing Relational Data with Apache Sqoop

Capturing Data with Apache Flume

Developing Custom Flume Components

Managing Workflows with Apache Oozie

Processing Data Pipelines with Apache Crunch

Working with Tables in Apache Hive

Developing User-Defined Functions

Executing Interactive Queries with Impala

Understanding Cloudera Search

Indexing Data with Cloudera Search

Presenting Results to Users

Conclusion

Please note that you need to bring your own laptop for this training. The laptop should meet the following requirements:

Minimum RAM required: 8GB

Minimum Free Disk Space: 25GB

VMware Player 6.x or above (Windows)/VMware Fusion 6.x or above (Mac)

Student machines must have VT-x virtualization support enabled in the BIOS.

If the machine is running a 64-bit version of Windows, or Mac OS X on a Core 2 Duo processor or later, no further test is required. Otherwise, VMware provides a tool to check compatibility, which can be downloaded from http://tiny.cloudera.com/training2