How Amazon Re-invented Data Science at AWS re:Invent 2015

Introduction

The ways of analyzing and visualizing data are changing, and we must embrace this change.

Let me begin with a quick, 8-line story.

King Horik hired a civil engineer to build a city center in his kingdom. He specifically stressed the urgency of this construction. After learning all the details and specifications, the civil engineer promised to do so, but said, ‘It will take a year, and I will need to bring all my men for this work. It will cost you more.’

The king realized that, with all the engineer’s men, building this city center would cost him a fortune. He had already spent enough on storing food for his citizens. He got worried. He called his son, the wisest of all men in his kingdom, who took over the responsibility of building the city center.

The son hired a consultant and described their situation. The consultant offered him a deal. He (the consultant) said, ‘I will build the city center at my own cost. But you’ll have to pay for every citizen entering my city center.’ And the deal was made.

Lesson from the story: the city center is data science applications. The consultant is Amazon. The rest is history.

Amazon re:Invent 2015

Amazon held its 4th annual re:Invent from 6th to 9th October 2015 in Las Vegas, USA. Had I been around Vegas at this time of the year, I would have definitely grabbed a seat at this festival.

Let me tell you what happened at this festival (in brief).

At re:Invent 2015, AWS showcased the years of hard work and dedication it has spent building incredible products. Name a technology or service, and Amazon had a product to offer.

AWS has become the fastest-growing multi-billion-dollar enterprise IT company in the world. The guy presenting is Andy Jassy, Senior Vice President, AWS (Proof: Image below).

Why am I telling you all this?

Amazon has built great products and it has offered these products at incredible prices. But, why are you telling me all this? I bet you thought this for a moment! 😉

Well, Amazon has been the king of cloud computing, and when it starts rolling out products aimed at data scientists and business intelligence (BI) professionals, it is better to watch than to be caught off-guard!

As always, the products Amazon has come out with aren’t made just for organizations, but for personal use as well. And they aim to disrupt a multi-billion-dollar market with deep customer dissatisfaction. Let me tell you how!

P.S – This article only highlights big data & analytics related knowledge from re:Invent 2015. I do not intend to promote any product or service here.

Data Science Products

Not only has the process of collecting, uploading, storing and processing data on AWS become faster, but the addition of data-centric services has made it a comprehensive platform for a data scientist. Below are some products introduced at re:Invent 2015:

AWS IoT

IoT is no longer a dream. It’s right here.

AWS IoT is a cloud service which allows easy and secure connections with cars, factory floors, sensor grids, factory engines and almost anything else that transmits data. Amazon has made sure that this service is a great fit even for devices that have limited memory, processing power or battery life.

AWS IoT is built from components such as thing shadows, a rules engine, a message broker, SDKs and a thing registry, whose sole job is to ensure that devices stay connected despite bad connectivity, lack of storage and other unfavorable conditions.

The best part is, the first 250,000 messages exchanged among these devices are free! And it is dirt cheap even after that.
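To make the idea of a thing shadow concrete, here is a toy model in plain Python. It is only a sketch of the concept, not the AWS IoT API: the class `ThingShadow` and its method names are hypothetical. The assumption is that a shadow remembers the state an application wants (desired) and the state the device last reported, so that a device coming back online only needs to act on the difference.

```python
class ThingShadow:
    """Toy stand-in for a device shadow (hypothetical, not the AWS API)."""

    def __init__(self):
        self.desired = {}   # state the application wants the device to have
        self.reported = {}  # state the device last told us it has

    def update_desired(self, **state):
        self.desired.update(state)

    def update_reported(self, **state):
        self.reported.update(state)

    def delta(self):
        """Return desired settings that the device has not yet reported."""
        missing = object()  # sentinel so a reported value of None still counts
        return {
            key: value
            for key, value in self.desired.items()
            if self.reported.get(key, missing) != value
        }


shadow = ThingShadow()
shadow.update_reported(led="off", interval=30)  # device reports, then goes offline
shadow.update_desired(led="on", interval=30)    # app changes a setting meanwhile

pending = shadow.delta()            # what the device must apply on reconnect
shadow.update_reported(**pending)   # device applies it and reports back
```

Once the device has reported the pending change, the delta becomes empty, which is how the two sides know they are in sync again.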

Amazon QuickSight

Next surprise: Amazon has made the process of delivering Business Intelligence (BI) solutions faster (in seconds) with QuickSight.

Those of you who know the BI industry know how broken it is! Before QlikView and Tableau became popular, a BI project remained a gargantuan task. It would cost a lot, people would work on it for years, and by the time something was ready, the customer’s requirements would have changed! After spending a ton of money, human resources and time, the organization would move to the next vendor in the hope that they would deliver, and a similar story would repeat for most projects.

While solutions like QlikView and Tableau have already changed this to some extent, Amazon is setting things up for disruption. QuickSight not only delivers BI solutions faster, but does so at 1/10th the cost of traditional solutions. Isn’t that amazing?

This software is designed to work with many types of data, collected for purposes such as ad targeting, customer segmentation, forecasting & planning, marketing & sales analytics, inventory & shipment tracking and many more. In short, if you have data and you want insights from it, switch to QuickSight. It will be available at $9 per user per month with a one-year commitment, and at $18 for corporate users.

Here is the pricing plan:

I am just waiting to give this a spin!
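To put the quoted prices in perspective, here is a small arithmetic sketch. It assumes both figures are per user per month under a one-year commitment; the team size of 10 users is purely hypothetical.

```python
MONTHS_PER_YEAR = 12


def annual_cost(users, monthly_price_per_user):
    """Yearly bill in dollars for a team, given a per-user monthly price."""
    return users * monthly_price_per_user * MONTHS_PER_YEAR


team_size = 10  # hypothetical team, for illustration only

standard_bill = annual_cost(team_size, 9)    # $9 per user per month
corporate_bill = annual_cost(team_size, 18)  # $18 per user per month

print(f"Standard:  ${standard_bill} per year")
print(f"Corporate: ${corporate_bill} per year")
```

For a 10-person team this works out to $1,080 and $2,160 a year respectively.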

Amazon Kinesis Firehose

You didn’t like QuickSight? No problem!

Amazon Kinesis Firehose lets you feed your existing BI tools and dashboards, thereby enabling real-time analytics. It provides the easiest way to load streaming data into AWS.

This service allows you to spend more time focusing on your application and less time on your infrastructure, since Firehose automatically takes care of monitoring, scaling and data management for users. It can also be used with other Amazon services such as Lambda, Redshift and EMR, with the aim of making data management more reliable and faster. Check out this 2-minute video.
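Here is a rough sketch of what loading one streaming record might look like from Python. The helper names (`encode_event`, `send_event`), the stream name and the sample event are all hypothetical. The call shape follows the `put_record` operation of the boto3 Firehose client, but a small fake client is used here so the sketch runs without AWS credentials; with a real account you would pass `boto3.client("firehose")` instead.

```python
import json


def encode_event(event):
    """Serialize one event as a newline-terminated JSON line (bytes)."""
    return (json.dumps(event, sort_keys=True) + "\n").encode("utf-8")


def send_event(client, stream_name, event):
    """Send one event through a Firehose-style client."""
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": encode_event(event)},
    )


class FakeFirehoseClient:
    """Stand-in for the real client (assumption: used for local runs only)."""

    def __init__(self):
        self.sent = []

    def put_record(self, DeliveryStreamName, Record):
        self.sent.append((DeliveryStreamName, Record["Data"]))
        return {"RecordId": str(len(self.sent))}


client = FakeFirehoseClient()
response = send_event(client, "clickstream-demo", {"user": "u1", "page": "/home"})
```

Writing each event as one JSON line is a design choice of this sketch: it keeps records easy to split again once they land in storage.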

AWS Database Migration Service

As the name suggests, this service allows you to move data to AWS easily and securely. In fact, while the migration happens, the source database remains fully operational.

This service can be used with all of the widely used commercial and open-source databases. For example, you can migrate data from Oracle to Oracle, as well as from Oracle to Microsoft SQL Server. There is no limitation on transferring data, even between heterogeneous data sources.

It includes facilities such as a schema conversion tool and many other built-in features which help transfer data successfully between databases.

AWS Lambda (Update)

If you are still unaware of Lambda, let me introduce it to you!

At times, you may have had difficulty running code on a server. Perhaps your server failed to run the code successfully, or worse, it crashed. At least I have!

Lambda gives you the opportunity to run code without thinking about servers. The incredible fact is, you only pay for the compute time: there is no charge when your code is not running. You just need to upload your code and sit back. The service takes care of everything required to run and scale the code with high availability.

This year, at re:Invent 2015, Amazon announced extended Lambda support for Python functions, increased function duration, function versioning & aliasing and many more.
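For a feel of what a Python function for Lambda might look like, here is a minimal sketch. A Python handler takes an `event` and a `context` argument; the name `lambda_handler` is just the common convention, and the event shape used here (a `"values"` list) is a made-up example, not something Lambda prescribes.

```python
def lambda_handler(event, context):
    """Entry point the service would call; `event` is the invocation payload."""
    values = event.get("values", [])  # hypothetical payload field
    if not values:
        return {"count": 0, "mean": None}
    return {"count": len(values), "mean": sum(values) / len(values)}


# The handler is an ordinary function, so it can be tried locally
# before uploading; `context` is unused here, so None is passed.
result = lambda_handler({"values": [2, 4, 9]}, None)
print(result)
```

Because the handler is plain Python, you can test the logic on your own machine and leave only the running and scaling to the service.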

Summing up, among all the announcements, I found these 5 products to be the most useful for a data science / big data professional. Whether you analyze or visualize data, these products will surely make your life simpler and happier. In case you feel intrigued to find out about more products, you can always find them here.

6 Must Watch Videos for a Big Data Scientist

re:Invent wasn’t limited to product showcasing, but extended to delivering knowledge on some of the most talked-about topics in the data science / big data industry. Here are the top 6 videos I found useful for a data science professional. Remember, the ways of analyzing and visualizing data are changing, and we must embrace this change.

1. Deep Learning – Going Beyond Machine Learning

2. Data Science & Best Practices for Apache Spark

3. Real-World Applications with Machine Learning

4. Big Data Architectural Patterns and Best Practices

5. Amazon Elastic Map Reduce

6. Your First Big Data Application on AWS

End Notes

With AWS re:Invent 2015, Amazon has not only launched and renewed its product line, it has also challenged the traditional BI industry in the way it does best. It has also democratized IoT by launching its product.

Will these highly competitive products end up becoming the kings of their own markets? Only time will tell. But I’ll put my money on Amazon to do so.

With this article, my aim was to give you the store of data science knowledge from re:Invent 2015. Hopefully, I’ve done justice to it. Fingers crossed, though! While writing this last line, I’m curious to know which product you like the most. Do share it in the comments section below!

Kunal is a post graduate from IIT Bombay in Aerospace Engineering. He has spent more than 10 years in the field of Data Science. His work experience ranges from mature markets like the UK to a developing market like India. During this period he has led teams of various sizes and has worked on various tools like SAS, SPSS, Qlikview, R, Python and Matlab.

Did I understand correctly that QuickSight has a graphing option that updates from a live feed from the original data source? Has anyone worked with this and if so, what is your opinion on its efficacy and the complexity of setting it up? Thnx.