THE HEAT IS ON! A Simple Guide to Creating Heatmaps

Heatmaps and choropleths can give your data a sizzling new look with easy-to-understand visualizations. Here’s how.

Never before has the world been flooded with so much data. Everyone is collecting, analyzing, and mashing up Big Data, from the federal government to the smallest mom-and-pop business. The biggest challenge now is to make sense of that data, and sometimes a powerful visualization is just the thing.

When it comes to geographic data, heatmaps (and their first cousins, choropleths) have long been a favorite visualization strategy. They let information consumers get an intuitive feeling for where the biggest concentrations of an item are, or where there are voids. And, it turns out, a couple of freely available tools let you whip out heatmaps in no time at all.

At this point, you may be saying, “I’ve heard of heatmaps, but what the heck is a choropleth?” The best way to illustrate the difference is probably by example. Let’s say you had data on 24,000 My Little Pony: Friendship is Magic fans (aka Bronies). One way to visualize that data would be to color each country to indicate how many fans live there.

This is a choropleth. In it, geographic regions are filled in with a color that represents a value. One classic example everyone has seen is the “red states and blue states” election maps. Choropleths are good when your data is associated with fairly large geographic regions (states, countries, etc.).

Heatmaps, conversely, are better when you have a lot of data points that represent small areas (or points). In addition to knowing what country all those Bronies came from, I also knew what zip code each of the U.S. fans lived in. Zip codes are in fact areas with boundaries, but when looked at from a national level they are so small they might as well be points. You want a feeling for where the greatest concentrations are, not a precise value at any one position. That’s what a heatmap is good for.

So, how can you produce these powerful visualizations? There are two good free tools: OpenHeatMap and the Google Maps JavaScript API. Each has its own strengths and weaknesses, but between the two of them you can generate just about any visualization you could want.

Let’s start with OpenHeatMap, since it’s the easiest to use. Begin by preparing your data in a spreadsheet (OpenHeatMap supports both native Excel and CSV formats). The spreadsheet should have two columns: the location and the value. For example, here’s the top of the spreadsheet that generated the zip code heatmap.
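If your raw data is one record per fan rather than one row per location, you need to tally it first. Here is a minimal sketch of that step, not the spreadsheet used for this article: the sample zip codes are made up, and the header names zip_code and value are assumptions, so check which column names OpenHeatMap expects before uploading.

```javascript
// Hypothetical input: one zip code per fan record (made-up sample values).
const fanZipCodes = ["10001", "94103", "10001", "60614", "94103", "10001"];

// Tally how many records share each location.
function countByLocation(locations) {
  const counts = new Map();
  for (const loc of locations) {
    counts.set(loc, (counts.get(loc) || 0) + 1);
  }
  return counts;
}

// Render the tallies as two-column CSV text: location first, value second.
// The header names are an assumption, not something OpenHeatMap is known to require.
function toCsv(counts, locationHeader, valueHeader) {
  const lines = [`${locationHeader},${valueHeader}`];
  for (const [loc, n] of counts) {
    lines.push(`${loc},${n}`);
  }
  return lines.join("\n");
}

const csv = toCsv(countByLocation(fanZipCodes), "zip_code", "value");
console.log(csv);
```

Zip codes are kept as strings on purpose, so leading zeros survive the trip into the CSV.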

Once you have your spreadsheet, go to www.openheatmap.com and click on “Create your map.” A popup asks you to upload your file (or you can use a spreadsheet in Google Docs). After processing, you are given the opportunity to customize the map. You do want to do this, because the defaults don’t always look that good. For example, here’s the default view for the zip code data.

To make the map you saw, I began by changing the color scheme to red, then picked blobs instead of circles and drastically reduced the size of each blob. You can also change the default center and zoom, and give the map a title and change the legend. Once you’re happy, click “Save & View,” and you are presented with your final map.

If you want, you can just take a screen capture at that point, but the real power is in linking to the image, since the URL is persistent. For example, if you click here, you’re taken to the live version of that zip code map, and can pan and zoom the map, or mouse over a point to learn more.

OpenHeatMap supports a plethora of geographic zones (here’s the full list), but it does have a few limitations. The biggest one I discovered with the Brony dataset is that you are limited to only three color stops in the heatmap, and the values are distributed linearly along the spectrum. This works fine if your data is spread fairly evenly across its range, but not so much if it’s heavily skewed. In the zip code example, almost all the zip codes had fewer than four Bronies, but in the country example, the United States has as many Bronies as the rest of the world combined. If you use OpenHeatMap with the country data, you end up with something like this:

If you compare this to the example at the beginning of the article, you can see that much of the detail has been washed away into bland beige. That’s because – with the exception of the United States with over 13,000 Bronies – all the other countries have fewer than 1,000. There’s another big drop-off between countries with 500-1,000 Bronies and ones under 100. What you really want is a logarithmic color scale that can show the detail at the low end. Unfortunately, OpenHeatMap doesn’t support that.
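To see why the linear scale washes out, it helps to compute where a few values land on each kind of scale. The sketch below is only an illustration: the sample counts are invented to echo the ranges mentioned above, and the two helper functions are hypothetical names, not part of either tool.

```javascript
// Position of a value on a linear scale from min to max, as a fraction 0..1.
function linearPosition(value, min, max) {
  return (value - min) / (max - min);
}

// Position on a logarithmic scale; requires min > 0 and value > 0.
function logPosition(value, min, max) {
  return (Math.log(value) - Math.log(min)) / (Math.log(max) - Math.log(min));
}

// Illustrative counts only, loosely echoing the article's ranges.
const sampleCounts = [50, 500, 1000, 13000];
const scaleMin = 1;
const scaleMax = 13000;

for (const c of sampleCounts) {
  console.log(
    c,
    "linear:", linearPosition(c, scaleMin, scaleMax).toFixed(3),
    "log:", logPosition(c, scaleMin, scaleMax).toFixed(3)
  );
}
```

On the linear scale, a country with 1,000 fans sits less than a tenth of the way along the gradient, so it is nearly indistinguishable from one with 50. On the log scale the same country lands roughly 70 percent of the way along, which leaves plenty of gradient for the low end.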

Luckily, the Google Maps API does. But there’s a tradeoff. There’s no snazzy Web interface for the Google tool; you need to write a webpage using JavaScript to create the map. Documenting the full API is beyond the scope of this article (it’s massive), but here are two quick examples of how you can use it.

Let’s redo the zip code heatmap to begin. To use the Google heatmap, you need latitude and longitude, not zip code. I grabbed a zip-code-to-coordinate database off the Web (there are a ton of them; here’s the one I used). Then I used a little Excel-Fu to add latitude and longitude data to each line in the spreadsheet (hint: VLOOKUP). Finally, I fired up Emacs and recorded a simple editor macro to convert the data into JavaScript code. What I ended up with was this:
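As a rough stand-in for that kind of listing (this is not the generated code from the article), the sketch below does the lookup step in JavaScript instead of Excel and then hands the result to the heatmap layer. The lookup table, the sample rows, and the helper names joinCoords and addHeatmap are all hypothetical, and the coordinates are approximate. It assumes the page has loaded the Maps JavaScript API with its visualization library and already has a map object.

```javascript
// Hypothetical lookup table: zip code -> coordinates (approximate sample values).
const zipToCoords = {
  "10001": { lat: 40.75, lng: -73.99 },
  "94103": { lat: 37.77, lng: -122.41 },
};

// Hypothetical rows from the spreadsheet: zip code plus fan count.
const rows = [
  { zip: "10001", count: 3 },
  { zip: "94103", count: 2 },
  { zip: "00000", count: 1 }, // no match in the lookup table
];

// Join each row to its coordinates, skipping zip codes that have no match.
function joinCoords(rows, lookup) {
  const points = [];
  for (const { zip, count } of rows) {
    const coords = lookup[zip];
    if (coords) {
      points.push({ lat: coords.lat, lng: coords.lng, weight: count });
    }
  }
  return points;
}

// Browser-only step: wrap the points as weighted locations for the heatmap layer.
// `maps` is the google.maps namespace (loaded with the visualization library);
// `map` is an existing google.maps.Map instance.
function addHeatmap(maps, map, points) {
  const data = points.map((p) => ({
    location: new maps.LatLng(p.lat, p.lng),
    weight: p.weight,
  }));
  return new maps.visualization.HeatmapLayer({ data, map });
}

const points = joinCoords(rows, zipToCoords);
console.log(points);
```

In a real page you would call addHeatmap(google.maps, map, points) once the map has been created. Passing the namespace in as a parameter is just a design choice that keeps the join step testable outside a browser.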

If you really want to start digging into the API, you can use it to geocode street addresses to latitude and longitude on the fly.

There’s a similar API for doing choropleths. It’s a bit more limited than what OpenHeatMap can do, because it doesn’t support a lot of different region types (it’s pretty much limited to countries, states/provinces, and area codes). Notably, the API doesn’t support U.S. counties. But what it does allow is fine-grained control of the color gradient. Here’s the code that generated the map shown at the top of the article.

The important bit here is the colorAxis parameter. Using it, I could specify a logarithmic gradient rather than a linear one.
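One way to get that effect, sketched here under the assumption that the choropleth API in question is the Google Charts GeoChart, is to place the colorAxis stops at powers of ten and pair each stop with a color. This is not the article's original listing: the helper names, the six-color ramp, and the 13,000 maximum are illustrative choices.

```javascript
// Build color stops at powers of ten until the largest value is covered,
// so each tenfold jump in the data gets its own step of the gradient.
function logColorStops(maxValue) {
  const stops = [1];
  while (stops[stops.length - 1] < maxValue) {
    stops.push(stops[stops.length - 1] * 10);
  }
  return stops;
}

// Hypothetical six-color ramp, light to dark; one color per stop.
const ramp = ["#ffffcc", "#fed976", "#feb24c", "#fd8d3c", "#e31a1c", "#800026"];

// Assemble a chart options object whose colorAxis pairs each stop with a color.
function buildGeoChartOptions(maxValue, colors) {
  const values = logColorStops(maxValue);
  if (values.length !== colors.length) {
    throw new Error(`need ${values.length} colors, got ${colors.length}`);
  }
  return { colorAxis: { values, colors } };
}

const options = buildGeoChartOptions(13000, ramp);
console.log(JSON.stringify(options.colorAxis.values));
```

In a page that has loaded the Google Charts geochart package, an object like this would be passed as the second argument to the chart’s draw call, alongside a data table of country names and counts.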

By mapping your data, you can look at it from an entirely new perspective, one that brings geography and community into the equation. Between OpenHeatMap and the Google APIs, you should be able to generate any heatmap or choropleth you need.

Comments

Google heatmaps can really be useful if you use them right. I would like to share another instance where we used heat maps for one of our clients. We integrated the Heat Map API to display locations for the most searched store. I would like to share this with you and your readers and would appreciate any feedback on it. http://wisdmlabs.com/blog/add-google-heat-maps-to-your-wordpress-website/

We’re working on a service that allows users to visually plot coordinates, addresses, and other geographical data on interactive heatmaps: http://simpleheatmap.com
It uses Google Maps API. No choropleth support, but it does offer about 10x more free data points than competitors.