Hadoop is a software platform for processing very large amounts of data. It includes the Hadoop Distributed File System (HDFS), which can store petabytes of data across thousands of nodes. HDFS replicates data across nodes so that it remains available even if individual nodes fail or their data becomes corrupted. Hadoop also includes Map/Reduce, a programming model for splitting the processing of that data into smaller chunks of work and distributing that work across the nodes in the cluster.
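
The Map/Reduce model described above can be illustrated with a minimal, single-process sketch. This is not the Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration, and everything runs in one Python process rather than across a cluster. It only shows how work is split into per-chunk map tasks whose output is grouped by key and then reduced.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk stands in for a block of input that a real cluster
    # would hand to a separate node; here they are processed in a loop.
    mapped = []
    for chunk in chunks:
        mapped.extend(map_phase(chunk))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, split into three chunks.
chunks = ["the quick brown fox", "the lazy dog", "the fox"]
counts = word_count(chunks)
```

Because each call to `map_phase` depends only on its own chunk, and each call to `reduce_phase` depends only on one key, both steps can in principle run in parallel on different nodes, which is the property the model relies on.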