Big Data, R and HANA: Analyze 200 Million Data Points and Later Visualize Using Google Maps

For this fun exercise, I analyzed more than 200 million data points using SAP HANA and R, and then brought the aggregated results into HTML5 using D3, JSON and the Google Maps APIs. The 2008 airline data comes from the Data Expo, and I have been using the entire data set (123 million rows and 29 columns) for quite some time. See my other blogs.

The results look beautiful:

Each airport icon is clickable and, when clicked, displays an info window with the key stats for the selected airport:

I then used D3 to display the aggregated result set in the modal window (light box):

D3 made it ridiculously simple to generate a table from a JSON file.
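The underlying idea is that each JSON record becomes one table row and each key becomes a column. Here is a minimal sketch of that mapping, written in plain JavaScript rather than D3 so that it runs outside a browser. The function names, the field names (`airport`, `flights`, `avgDepDelay`) and the sample values are placeholders I made up for illustration, not the actual result set:

```javascript
// Escape characters that would otherwise break the HTML markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an HTML table string from an array of flat JSON records.
// Column order follows the keys of the first record.
function jsonToTable(records) {
  if (records.length === 0) return "<table></table>";
  const columns = Object.keys(records[0]);
  const head =
    "<tr>" + columns.map((c) => `<th>${escapeHtml(c)}</th>`).join("") + "</tr>";
  const body = records
    .map(
      (r) =>
        "<tr>" + columns.map((c) => `<td>${escapeHtml(r[c])}</td>`).join("") + "</tr>"
    )
    .join("");
  return `<table><thead>${head}</thead><tbody>${body}</tbody></table>`;
}

// Placeholder records (hypothetical field names and values).
const sampleRecords = [
  { airport: "AAA", flights: 100, avgDepDelay: 7.5 },
  { airport: "BBB", flights: 50, avgDepDelay: 12 },
];
const tableHtml = jsonToTable(sampleRecords);
```

With D3, the same mapping is typically written as a data join (`selectAll("tr").data(records).enter().append("tr")`), which is a large part of why the code stays so short.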

Unfortunately, I can’t provide a live example because of the usage restrictions on the Google Maps APIs, and I am approaching my free API limit.

Fun fact: The Atlanta airport was the largest airport in 2008 on many dimensions: Total Flights Departed, Total Miles Flown, Total Destinations. It also experienced a lower average departure delay in 2008 than Chicago O’Hare. I always thought Chicago O’Hare was the largest US airport.
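For readers curious how those per-airport dimensions are derived, here is a rough sketch of the aggregation in plain JavaScript. It is illustrative only: the real aggregation ran in SAP HANA and R, not in the browser. I am assuming the flight records carry the `Origin`, `Dest`, `Distance` and `DepDelay` columns; the function name, output field names and toy records are made up:

```javascript
// Aggregate per-origin-airport statistics from an array of flight records.
function summarizeByOrigin(flights) {
  const byOrigin = new Map();
  for (const f of flights) {
    let s = byOrigin.get(f.Origin);
    if (!s) {
      s = { flights: 0, miles: 0, destinations: new Set(), delaySum: 0, delayCount: 0 };
      byOrigin.set(f.Origin, s);
    }
    s.flights += 1;
    s.miles += f.Distance;
    s.destinations.add(f.Dest);
    // Skip missing delays (for example, cancelled flights) when averaging.
    if (typeof f.DepDelay === "number") {
      s.delaySum += f.DepDelay;
      s.delayCount += 1;
    }
  }
  // One summary object per airport, in first-seen order.
  return [...byOrigin].map(([origin, s]) => ({
    origin,
    totalFlights: s.flights,
    totalMiles: s.miles,
    totalDestinations: s.destinations.size,
    avgDepDelay: s.delayCount ? s.delaySum / s.delayCount : null,
  }));
}

// Toy records with hypothetical airport codes, not real data.
const toyFlights = [
  { Origin: "AAA", Dest: "BBB", Distance: 100, DepDelay: 10 },
  { Origin: "AAA", Dest: "CCC", Distance: 300, DepDelay: 20 },
  { Origin: "AAA", Dest: "BBB", Distance: 100, DepDelay: null },
  { Origin: "BBB", Dest: "AAA", Distance: 100, DepDelay: 0 },
];
const airportSummary = summarizeByOrigin(toyFlights);
```

Sorting the resulting array by `totalFlights`, `totalMiles` or `totalDestinations` is then enough to compare airports on each dimension.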

As always, I needed just 6 lines of R code, including two lines to write the data to JSON and CSV files: