What I'm going to use in this course to work with Hadoop is the Cloudera Hadoop sample virtual machine. There are a couple considerations here. In order to work with this, you need to look at your operating system and get the appropriate virtualization software. Sometimes the virtualization software costs money, sometimes it's free. I'll be using a Windows machine with VirtualBox. I've already downloaded the VirtualBox software in preparation for setting up my VM. The next thing is I'm going to get the Hadoop virtual machine from Cloudera for Windows for VirtualBox.

And I'll show you what that looks like, or where you get it from, in just a second. That machine will include Cloudera tools and samples. It's a developer edition. What I'll be working with is the Enterprise edition, but for development. It's important to understand that if you were to deploy that, you would have to pay Cloudera a fee for some of the tools that are included with the sample that you're working with. And then in a later movie, I'll be showing you…

Released: 1/20/2015

Hadoop is indispensable when it comes to processing big data, as necessary to understanding your information as servers are to storing it. This course is your introduction to Hadoop, its file system (HDFS), its processing engine (MapReduce), and its many libraries and programming tools. Developer and big-data consultant Lynn Langit shows how to set up a Hadoop development environment, run and optimize MapReduce jobs, code basic queries with Hive and Pig, and build workflows to schedule jobs. Plus, get a sneak peek at some up-and-coming libraries like Impala and the lightning-fast Spark.