Deployment of Hadoop-based Services on Windows and on Windows Azure

Apache™ Hadoop™ is an open source framework designed to scale from a single server to thousands of machines, each offering local computation and storage. It is well suited to analyzing large unstructured datasets and discovering relationships within them. Data processing in Hadoop is distributed across a cluster of computers using a simple programming model. For a complete reference on Hadoop, see hadoop.apache.org.

The core Hadoop project contains the following components:

Hadoop Common is the set of common utilities that support the other Hadoop-related subprojects.

Hadoop Distributed File System (HDFS) is the primary storage system used by Hadoop applications.

Hadoop MapReduce is the software framework for distributed processing of large datasets across a Hadoop cluster.
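The MapReduce data flow described above can be illustrated with a minimal word-count sketch. The following is plain Python standing in for Hadoop's map, shuffle/sort, and reduce phases; it is not Hadoop's actual API, and the function names are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle/sort: group the intermediate pairs by key."""
    for key, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield key, [count for _, count in group]

def reduce_phase(grouped):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in grouped}

lines = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["the"])  # 3
```

In a real Hadoop job, the map and reduce functions run on different machines in the cluster, and the framework handles the shuffle, sorting, and fault tolerance between them.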

Hadoop-based services for Microsoft Windows include the core components and the following Hadoop-related projects: