Hadoop

Introduction to Hadoop & Big Data

Hadoop

Hadoop is an open-source framework for storing and processing big data in a distributed environment, across clusters of computers, using simple programming models. It is designed to scale up from a single server to thousands of machines, each providing local computation and storage.

Big Data

Big data is a collection of large datasets that cannot be processed using conventional computing techniques. It is not a single technique or tool; rather, it spans many areas of business and technology.

What Comes Under Big Data?

 Social Media Data.

 Transportation Data.

 Search Engine Data.

 Power Grid Data.

 Stock Exchange Data, etc.

Big data involves high volume, high velocity, and a wide variety of data. The data falls into three types.

 Semi-Structured data: XML data.

 Unstructured data: Word documents, PDF files, plain text, media logs.

 Structured data: Relational data.
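As a rough illustration of the three types listed above, the following Python sketch handles one tiny sample of each. The table name, XML tags, and sample sentence are all invented for the example, and an in-memory SQLite database merely stands in for a relational store.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: relational rows with a fixed schema
# (hypothetical "trades" table in an in-memory SQLite database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (symbol TEXT, price REAL)")
conn.execute("INSERT INTO trades VALUES ('ABC', 10.5)")
structured = conn.execute("SELECT symbol, price FROM trades").fetchall()
conn.close()

# Semi-structured: XML carries its own tags, but no rigid schema is enforced.
xml_doc = "<reading><meter>42</meter><kwh>3.7</kwh></reading>"
root = ET.fromstring(xml_doc)
semi_structured = {child.tag: child.text for child in root}

# Unstructured: free text; any structure has to be inferred by the program.
unstructured = "Traffic was heavy near the station this morning."
word_count = len(unstructured.split())

print(structured)
print(semi_structured)
print(word_count)
```

The point of the sketch is only the difference in how much structure the program can rely on: a query for structured data, tag lookup for semi-structured data, and ad hoc text processing for unstructured data.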

Big Data Challenges

The significant challenges associated with big data are as follows:

 Capturing data.

 Curation.

 Storage.

 Searching.

 Sharing.

 Transfer.

 Analysis.

 Presentation.

Hadoop Architecture

At its core, Hadoop has two major layers, namely:

(a) Processing/Computation layer (MapReduce).

(b) Storage layer (Hadoop Distributed File System, HDFS).

MapReduce

MapReduce is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks. MapReduce programs are written in a particular style influenced by functional programming constructs, specifically idioms for processing lists of data.
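To make the model more concrete, here is a minimal single-machine sketch of the map, shuffle, and reduce steps in plain Python, using word counting as the task. It does not use Hadoop's own API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and on a real cluster the independent tasks would run in parallel on different machines.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    # Each line is an independent task; here they simply run one after another.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())

lines = ["big data needs big storage", "hadoop stores big data"]
counts = word_count(lines)
print(counts["big"], counts["data"], counts["hadoop"])  # 3 2 1
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on the values for its own key, the work can be split across machines without the tasks needing to talk to one another.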

HDFS (Hadoop Distributed File System)

HDFS, the Hadoop Distributed File System, is a distributed file system designed to hold very large amounts of data (terabytes or even petabytes) and to provide high-throughput access to that data. Files are stored redundantly across multiple machines to ensure their durability in the face of failure and their high availability to highly parallel applications.
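The idea of redundant storage can be sketched with a small simulation: a file is cut into fixed-size blocks, and each block is copied to several machines so that it survives machine failures. This is a toy model under stated assumptions, not HDFS code. The node names, the 256-byte block size, and the round-robin placement are all hypothetical, and real HDFS uses its own placement policy.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical placement for illustration only; it is not the
    policy HDFS itself uses.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement

def readable_blocks(placement, failed_nodes):
    """Return ids of blocks that still have at least one live replica."""
    return [b for b, holders in placement.items()
            if any(n not in failed_nodes for n in holders)]

data = b"x" * 1000
blocks = split_into_blocks(data, block_size=256)  # sizes: 256, 256, 256, 232
nodes = ["node1", "node2", "node3", "node4"]
placement = place_replicas(len(blocks), nodes, replication=3)

# With three replicas per block, losing two machines still leaves one copy.
survivors = readable_blocks(placement, failed_nodes={"node1", "node2"})
print(len(blocks), len(survivors))  # 4 4
```

With a replication factor of 3, any two of the four simulated nodes can fail and every block remains readable, which is the durability property the paragraph above describes.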