GHCN-v2 Data

My motivation was twofold: first, I wanted to see the range of temperatures by location; second, I wanted to see data availability over time at each location. The usual average global temperature anomaly graphs provide neither.

To this end, I wrote some quick-and-dirty Perl scripts that produce the 13,736 files (HTML pages and PNG time-series graphs) you can find on climate.unur.com. Links to the data I used are on that page as well.

To plot the data, I first sliced and diced the large monolithic data sets using the script below:

This is a fairly simple script that creates one file per unique station (identified by WMO station identifier, modifier, and duplicate number). Each file has three columns: year, month, and temperature. For purposes of illustration, running this on v2.mean results in 13,471 files.
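The splitting step described above can be sketched as follows. The author's script was Perl; this is an illustrative Python reimplementation, and the fixed-width field offsets (12-character station id including the duplicate digit, 4-character year, twelve 5-character monthly values in tenths of a degree, with -9999 marking missing months) are my reading of the v2.mean layout, not taken from the original script.

```python
import os

def split_records(lines, outdir):
    """Split v2.mean-style records into one file per unique station."""
    os.makedirs(outdir, exist_ok=True)
    handles = {}
    try:
        for line in lines:
            station = line[0:12]   # WMO id + modifier + duplicate number
            year = line[12:16]
            fh = handles.get(station)
            if fh is None:
                fh = open(os.path.join(outdir, station + ".txt"), "w")
                handles[station] = fh
            for month in range(12):
                raw = line[16 + 5 * month : 21 + 5 * month]
                value = int(raw)
                if value != -9999:  # skip missing observations
                    # Three columns: year, month, temperature (tenths -> degrees C)
                    fh.write(f"{year}\t{month + 1}\t{value / 10.0}\n")
    finally:
        for fh in handles.values():
            fh.close()
```

Keeping one open handle per station avoids reopening files on every record, which matters when a single pass over the monolithic input touches thousands of stations.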

By the way, I decided to use the filesystem as the database because it cut each script's run time to minutes rather than hours, compared with an earlier version that used SQLite.

Next, I generated a single series for each WMO location. When there were multiple series available for a given WMO location, I averaged the non-missing observations by month. Here is the source code for that script:

After running this script, I had 4,498 monthly mean files by WMO location and 4,407 adjusted monthly mean files by WMO location. Each file contains two columns: a year-month label and a combined temperature observation. This is not the correct way to do it for statistical analysis, because each combined value ends up being an average of a potentially different number of observations. However, it was sufficient for my visualization-only purpose.
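The combining step can be sketched like this. Again, the author's script was Perl; this Python version is an illustrative reimplementation, and the input shape (tuples of year, month, temperature drawn from the per-station files) and the year-month label format are my assumptions.

```python
from collections import defaultdict

def combine_series(rows):
    """Average all non-missing duplicate series for one WMO location by month.

    rows: iterable of (year, month, temperature) tuples from every
    station file sharing the same WMO location.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, temp in rows:
        key = (year, month)
        sums[key] += temp
        counts[key] += 1
    # One entry per year-month: a label plus the mean over however many
    # duplicate observations happened to be available that month.
    return {f"{y}-{m:02d}": sums[(y, m)] / counts[(y, m)]
            for (y, m) in sorted(sums)}
```

For example, two duplicates reporting 10.0 and 12.0 for January 1990 combine to 11.0, while a month covered by a single series just passes through. This is exactly the caveat noted above: each combined value may average a different number of observations.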