Big Data & Its Emergence

Transcript of Big Data & Its Emergence

By Juhi Tiwari

Understanding the "Big" in Big Data

Emergence of Big Data

What is Big Data?
“A massive volume of both structured and unstructured data that is so large that it's difficult to process with traditional database and software techniques.”

Causes:
1. Increasing impact of social and economic changes.
2. Increase in scientific data.
3. Increasing demand for and supply of GPS-enabled devices.
4. Fueled by cloud properties such as economy of scale, extensibility, affordability and agility.

Is Big Data only big in size?
The answer is "No".
It also refers to the velocity at which the data is growing.
It also refers to the variety of data being generated from a variety of sources.

Big Data in Science
1. Decoding the human genome originally took 10 years to process; now it can be achieved in one week.
2. CERN experiments show data generation of 40 TB/sec. The Large Hadron Collider (LHC) experiments represent about 150 million sensors delivering data 40 million times per second.

Introducing Volumes of Big Data in Our Daily Life
Walmart handles more than 1 million customer transactions every hour.
Facebook handles 40 billion photos from its user base and has had more than a billion users since October 2012, of which 604 million access it from mobile devices.
Twitter sees 400 million tweets per day, and 84 million users access it via mobile.
The volume of business data worldwide, across all companies, doubles approximately every 1.2 years.

Some More Stats to Realize Big Data: the Private Sector
AT&T transfers 30 PB of data per day.
Google processes 24 PB of data per day.
90 trillion emails are sent per year.

Types of Big Data
Structured data: transactional and time-phased data.
Unstructured data: data from social media, streamed channel data, customer service, etc.
Sensor data: temperature, RFID, GPS.
New data types: video, voice, digital images.

Big Data Motivations
Data visualisation techniques such as dashboarding are given larger data sets to query in order to get the big picture.
Virtualisation provides agility and extensibility of scale.
Reduction in the cost of large storage devices such as SSDs and flash devices.

Techniques for Big Data Analytics
Some technologies being practiced and researched in relation to Big Data:
1. Massively parallel-processing (MPP) databases, search-based applications, data-mining grids, distributed file systems like Hadoop (Cloudera, Apache Hive, Pig), distributed databases, cloud-based infrastructure (applications, storage and computing resources) and the Internet.
2. Columnar databases and compression, in-memory databases (SAP HANA), analytical appliances (Oracle Exadata, IBM Netezza, Teradata appliance), query-optimizing techniques.

Possible Effects Produced by Big Data in Various Fields
Turn 12 terabytes of Tweets created each day into improved product sentiment analysis.
Convert 350 billion annual meter readings to better predict power consumption.
Scrutinize 5 million trade events created each day to identify potential fraud.
Analyze 500 million daily call detail records in real time to predict customer churn faster.
Monitor hundreds of live video feeds from surveillance cameras to target points of interest.
Exploit the 80% data growth in images, video and documents to improve customer satisfaction.

Limitations of Big Data
The majority of solutions are delivered as appliances, as software, or as cloud-based services.
Data is not only too big to process but also too big to transport.
It's messy and involves a high cost of data acquisition and cleaning.
Privacy, interoperability challenges and imperfect algorithms are seen in developing countries.
The technology is still developing and is not yet widely practiced.

What Does Big Data Do?
Explores granular-level details of business operations that could not be captured in warehouses or reports.
Answers questions that were previously considered beyond reach.
Transforms batch processing into real-time processing, enabling analysis of credit and market risks and identification of potential fraud in tax and finance.

The Timeline
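The figure quoted earlier in this transcript, that worldwide business data doubles roughly every 1.2 years, implies a simple exponential growth curve. The sketch below is only a back-of-the-envelope illustration of that arithmetic: it assumes the doubling period stays constant over time, and the function names are hypothetical, not taken from the presentation.

```python
# Assumption: the 1.2-year doubling period quoted in the transcript
# stays constant. This is an illustration, not a forecast.
DOUBLING_PERIOD_YEARS = 1.2


def growth_factor(years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Return how many times larger the data volume is after `years`."""
    return 2 ** (years / doubling_period)


def projected_volume(start_volume, years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Scale a starting volume (in any unit) by the growth factor."""
    return start_volume * growth_factor(years, doubling_period)


if __name__ == "__main__":
    # Print the multiplier for a few illustrative time horizons.
    for years in (1.2, 2.4, 5, 10):
        print(f"after {years} years: x{growth_factor(years):.1f}")
```

Under this assumption, one doubling period gives a factor of 2, two periods a factor of 4, and five years roughly an 18-fold increase.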