Job Preparation Guide

Which platforms are used for large scale cloud computing?

There are two platforms that are primarily used for scale cloud computing: MapReduce and Apache Hadoop.

MapReduce – is a software by Google. It supports distributed computing and works on large set of data. It distributes the data to several other computers known as clusters by using cloud resources. It has the capability to deal with both structured and non-structured data.

Apache Hadoop – is an open source distributed computing platform written in Java. It creates a pool of computers that use the hadoop file system. It then clusters the data elements and applies the hash algorithms that are similar, and then creates copy of the files that already exist.