Three Techniques for Visualizing Data From Google BigQuery

This guest post comes from Ryan Boyd, developer advocate for BigQuery, helping businesses and developers get value out of their big data using Google's cloud data services (such as BigQuery). He has been at Google since 2006.

Google BigQuery is a REST-based API for SQL-like analysis of billions of rows of data in just a few seconds. Customers are using it to analyze advertising campaigns, web server logs, inventory availability and more. While BigQuery makes it easy for customers to gain insights into massive datasets, it can be challenging to make sense of the results without the power of visualization. The Google BigQuery browser tool doesn't have visualization built in, but the API enables you to integrate with other tools in just a few dozen lines of code.

Let's explore the US Birth Statistics dataset, which Google has made available as sample data in BigQuery. It's only 137 million records, but provides insights into things like baby weights, age of mother and more for children born between 1969 and 2008. I've also mashed up this data with 2008 election results so we can look at some of the data in light of red state/blue state status.

Using BigQuery with Google Apps Script and Google Spreadsheets

Google Spreadsheets and the included charting capabilities is a really powerful way to visualize data from BigQuery, using a tool that even non-developers understand. In this case, I wanted to understand whether there was a visible difference between the age of mothers when bearing their first child in red states (Republican-leaning) and blue states (Democrat-leaning).

In about 50 lines of code in Google Apps Script (server-side JavaScript), I’m able to send an authorized query off to BigQuery and pull the results into the spreadsheet above. I then created a pivot table and a graph based on that data. If I load more data (maybe another year) and want to query again, I can simply hit the ‘Run Query, Run!’ button above and the sheet, pivot table and graph automatically update.

The query, across 137 million rows of data, took only 2.9 seconds. You can try the query yourself against this open public data set:

SELECT state, AVG(mother_age) AS avg_age
FROM [publicdata:samples.natality]
WHERE year=2000 AND ever_born=1
GROUP BY state
ORDER BY avg_age DESC;

Using BigQuery with Commercial Visualization Tools

Are you reading Programmable Web, but don’t like to write code? There are still some visualization solutions available. Several commercial visualization apps have integrated with BigQuery to make it easy for analysts and execs to stay up to date. Bime and Qlikview have both built live demos using the US Birth Statistics data.

The simple REST API BigQuery provides to analyze massive datasets makes it easy to build many different types of visualizations. Try it out with public datasets without getting out your credit card, and drop me a note on Google+ if you build something exciting!