Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS). MapReduce divides applications into many small blocks of work. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster. MapReduce can then process the data where it is located.

Hadoop has been demonstrated on clusters with 2,000 nodes. The current design target is 10,000-node clusters.
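
To make the model concrete, the contract every job follows can be sketched like this (a deliberate simplification of mine, not Hadoop's real interfaces): map() runs in parallel over the input blocks, ideally on the nodes that already hold them in HDFS, and the framework groups everything map() emits by key before handing it to reduce().

```java
import java.util.List;

// A simplified picture of the MapReduce contract (hypothetical interfaces,
// not the real org.apache.hadoop classes).
interface Emitter<K, V> {
    void emit(K key, V value); // collect one intermediate or final pair
}

interface MapFn<K1, V1, K2, V2> {
    // Runs in parallel over the input blocks, ideally on the node
    // that already stores the block in HDFS.
    void map(K1 key, V1 value, Emitter<K2, V2> out);
}

interface ReduceFn<K2, V2, K3, V3> {
    // Receives all values emitted under the same key, grouped by the framework.
    void reduce(K2 key, List<V2> values, Emitter<K3, V3> out);
}
```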

I followed the Quickstart guide and can confirm that it works on [en:Mac OS X] too, but I only managed to run it in "standalone" mode, which is useful for first-stage development and debugging.
<!--more-->

To understand a bit more about how it works, I also went through the MapReduce Tutorial: I took the code of the WordCount example (v1.0) and wrote a Character Counter, i.e. the same code but with one more for loop and more internal documentation.
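
In a nutshell, the change is just this (sketched against the old org.apache.hadoop.mapred API that the v1.0 tutorial uses; the class name CharCountMap is only for illustration here, and the full commented code follows below):

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Sketch of the Character Counter mapper: WordCount v1.0 emits one
// (word, 1) pair per token; here an extra for loop walks each line
// character by character and emits (character, 1) instead.
public class CharCountMap extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private final Text character = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
        String line = value.toString();
        // The "one more for" with respect to WordCount
        for (int i = 0; i < line.length(); i++) {
            character.set(String.valueOf(line.charAt(i)));
            output.collect(character, one); // emit (char, 1)
        }
    }
}
```

The reducer can stay exactly as in WordCount, since all it does is sum the ones it receives for each key.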

Granted, it won't help you without at least a minimum of study of what MapReduce is, but I would like to share the code of what I did: