Excerpts of this article are copyright their respective owners. The original article can be found at Hortonworks.

At Yahoo!, a lot of the Apache Hadoop worker nodes have 6*2TB SATA drives, 24GB RAM and 8 cores in a dual socket configuration. This has proven to be a pretty good configuration. This year, I’ve seen systems with 12*2TB SATA drives, 48GB RAM and 8 cores in a dual socket configuration. We will see a move to 3TB drives this year.
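To put those drive counts in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes HDFS's default replication factor of 3 and ignores disk space reserved for the operating system, logs and intermediate job data, so the "usable" numbers are an upper bound rather than a measured result. The function names are made up for this illustration.

```python
def raw_capacity_tb(drives: int, drive_size_tb: float) -> float:
    """Raw disk capacity of one worker node, in TB."""
    return drives * drive_size_tb


def usable_capacity_tb(raw_tb: float, replication: int = 3) -> float:
    """Approximate unique data per node when every block is stored
    `replication` times (assumption: HDFS default of 3, no other overhead)."""
    return raw_tb / replication


# The two node configurations mentioned above: (number of drives, TB per drive).
configs = {
    "6 x 2TB": (6, 2.0),
    "12 x 2TB": (12, 2.0),
}

for name, (drives, size_tb) in configs.items():
    raw = raw_capacity_tb(drives, size_tb)
    usable = usable_capacity_tb(raw)
    print(f"{name}: raw {raw:.0f} TB, about {usable:.1f} TB of unique data at 3x replication")
```

Swapping 2.0 for 3.0 in the dictionary shows what the move to 3TB drives would do to the same estimate.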

Setting up Apache Hadoop on RHEL6/CentOS 6 is simple: the recent availability of RPMs for Apache Hadoop makes it much easier to set up a basic Hadoop cluster. This will allow you to focus on how to use the features instead of having to learn how they were implemented.

These instructions DO NOT tune Hadoop settings to make Hadoop fast, but they will get you running a Hadoop cluster fast. We will leave Hadoop optimization for another day.