Video Description

Understand data processing across structured, semi-structured, and unstructured data.

Learn how to move data from sources such as RDBMSs, web log servers, syslog servers, and social media.

In Detail

Hadoop is one of the most widely used open-source software frameworks for distributed computing, and learning it gives you a way to build your skills and advance your career. You will start with the basics of Hadoop, including its file system (HDFS), its cluster resource manager (YARN), and its many libraries and programming tools. This course will get you started with the major Hadoop components that the industry demands. You will see how structured, semi-structured, and unstructured data can be processed with Hadoop.

This course focuses mainly on the problems faced in Big Data and the solutions offered by the respective Hadoop components. You will learn to use components and tools such as MapReduce to process raw data, and you will learn how tools such as Hive and Pig aid in this process. You will then move on to data analysis techniques with Hadoop, using tools such as Hive, and learn to apply them in a real-world Big Data application. This course will teach you to perform real-time data analytics, as well as stream and batch processing, on your application. Finally, this course will teach you how to extend your analytics solutions to the cloud.

Downloading the example code for this course: You can download the example code files for all Packt video courses you have purchased from your account at http://www.PacktPub.com. If you purchased this course elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.