February 23, 2011

Population Growth and UHI – Another look using the NOAA population data

Update (2-24): I realized I should explicitly state that all charts below are for the 1970-2000 period.

A few months ago I plugged NOAA population data into the pairwise station comparison that I had previously run with the NHGIS census data, looking for a UHI signal. One of the questions I’ve had on the back burner since then is why the NOAA population data did not seem to show a UHI signal (particularly in the TOB dataset) comparable to that of the NHGIS data for the 1970-2000 time period. Since the NOAA data is presented in 1 km cells, it seemed that it should represent the population at a station better than the “Place” designation I used for my NHGIS station assignments. This post is some further analysis of the issue. I’m using the USHCNv2 TOB avg dataset, and my intermediate data and code can be retrieved from: http://dl.dropbox.com/u/9160367/Climate/2-23-11PostUHI.zip

First of all, I’ll note that I’ve been updating my comparison method since that original post. This analysis now restricts the difference between the two paired stations’ CRS-to-MMTS change years to 5 years, which does seem to eliminate some of the noise.
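As a rough illustration of that restriction (this is a sketch, not the code from the zip file above), a pair filter could look like the following. The function name, the tuple layout, and the station IDs and years are all made up for the example, and it assumes both stations in a pair have a known changeover year.

```python
def within_changeover_window(year_a, year_b, max_gap=5):
    """True if the two stations switched from CRS to MMTS within max_gap years of each other."""
    return abs(year_a - year_b) <= max_gap

# Hypothetical pairs: (station_a, changeover_year_a, station_b, changeover_year_b)
candidate_pairs = [
    ("A", 1985, "B", 1988),  # 3 years apart, kept
    ("C", 1984, "D", 1992),  # 8 years apart, dropped
    ("E", 1990, "F", 1985),  # exactly 5 years apart, kept
]

kept_pairs = [
    pair for pair in candidate_pairs
    if within_changeover_window(pair[1], pair[3])
]
```

Whether a gap of exactly 5 years should be kept is a judgment call; this sketch keeps it.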

Even with this restriction, I still do not get much of anything in the way of a signal:

However, one thing I noticed when doing a different analysis was that quite a few of the stations with the largest increase in LOG population tended to be the most rural (and also the stations with the lowest absolute population growth). It’s been suggested that the largest UHI effect might occur at the early stages of urbanization, but even if this were the case, I was skeptical that 1) this pristine->slightly urbanized transition could accurately be reflected using population as a proxy (as opposed to land changes), and 2) the NOAA data is precise enough to assume that going from 0.25 to 0.75 persons per km should carry the same weight as going from 10 to 30 persons per km. Furthermore, the main difference between the NHGIS and NOAA data is that NHGIS gives absolute population numbers for a place, whereas NOAA gives population per km and thus much lower numbers (whose differences will be amplified using log diffs). As a quick fix, then, I decided to use LOG(popDensity+X) for population instead of simply LOG(popDensity), and varied X. Here are some results:

X=10

X=25

X=50

This is far from optimized, but you can see in each scenario we seem to get a much stronger signal. This seems to suggest that we have to put a lower limit on when this “log population” relationship to UHI exists, or at least when it is reflected in the data.
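To make the effect of the offset concrete, here is a minimal sketch of the LOG(popDensity+X) difference, using the 0.25->0.75 and 10->30 examples from above. The function name is my own for illustration, and I use the natural log here (the base only rescales the values).

```python
import math

def log_pop_change(density_start, density_end, offset=0.0):
    """Change in LOG(popDensity + X) between two dates, with X = offset."""
    return math.log(density_end + offset) - math.log(density_start + offset)

# With X=0, any tripling of density gives the same log difference,
# so 0.25 -> 0.75 counts exactly as much as 10 -> 30.
rural_plain = log_pop_change(0.25, 0.75)
town_plain = log_pop_change(10, 30)

# With X=25, the near-empty cell barely registers while the
# 10 -> 30 change still shows up clearly.
rural_offset = log_pop_change(0.25, 0.75, offset=25)
town_offset = log_pop_change(10, 30, offset=25)

print(rural_plain, town_plain)
print(rural_offset, town_offset)
```

In other words, the offset acts as a soft lower limit: changes at densities well below X are heavily damped, while changes at densities well above X are nearly unaffected.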

Another area of investigation is related to the fact that an NHGIS “place” reflects a larger area than the 1 km radius we’ve used for the NOAA population data. Expanding the radius of population counted for a station might increase our signal in that 1) it is quite possible that a population increase more than 1 km away would still affect that station, and 2) it does not require the same locational precision from the NOAA population data.

Population radius=5km (X=25)

Population radius=10km (X=25)
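For anyone wondering what widening the radius amounts to, here is a minimal sketch under some simplifying assumptions: the 1 km cells are stored as a dictionary keyed by the cell-center coordinates in km, and the station's density is the mean over all cells whose centers fall within the radius. The data layout, function name, and numbers are hypothetical and not taken from the NOAA files.

```python
import math

def mean_density_within(cells, station_xy, radius_km):
    """Mean density over grid cells whose centers lie within radius_km of the station.

    cells maps (x_km, y_km) cell centers to a population density (assumed layout).
    """
    sx, sy = station_xy
    nearby = [
        density
        for (x, y), density in cells.items()
        if math.hypot(x - sx, y - sy) <= radius_km
    ]
    if not nearby:
        raise ValueError("no grid cells within the requested radius")
    return sum(nearby) / len(nearby)

# Made-up grid: a quiet cell at the station with denser cells farther out.
cells = {(0, 0): 10, (3, 0): 20, (0, 4): 30, (6, 8): 100}
station = (0, 0)

density_1km = mean_density_within(cells, station, 1)    # only the station's own cell
density_5km = mean_density_within(cells, station, 5)    # picks up the cells 3 and 4 km away
density_10km = mean_density_within(cells, station, 10)  # also includes the cell 10 km away
```

A wider radius also means a small error in the station's coordinates changes the result less, which is the second point above.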

I think this further investigation suggests that the NOAA data does show a signal, and one even stronger than in my NHGIS population runs.

In some future posts I should look into different time periods (after all, the NOAA pop data goes back to 1930), as well as examining specifically those pairs where both stations are in the same “group” of absolute population density (<10, <100, >100, etc). Perhaps the relationship is closer to linear than to log when going from 0 to 25 persons/km.
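A minimal sketch of how that grouping might work is below. The bin edges follow the groups mentioned above, but the labels, the choice to put a density of exactly 100 in the top group, and the sample pairs are my own assumptions for illustration.

```python
def density_group(density):
    """Assign an absolute population density to a coarse group."""
    if density < 10:
        return "<10"
    if density < 100:
        return "10-100"
    return ">=100"  # assumption: exactly 100 goes in the top group

def same_group_pairs(pairs):
    """Keep only station pairs where both stations fall in the same density group."""
    return [
        (id_a, dens_a, id_b, dens_b)
        for id_a, dens_a, id_b, dens_b in pairs
        if density_group(dens_a) == density_group(dens_b)
    ]

# Hypothetical pairs: (station_a, density_a, station_b, density_b)
pairs = [
    ("A", 2.0, "B", 7.5),     # both "<10", kept
    ("C", 8.0, "D", 40.0),    # different groups, dropped
    ("E", 150.0, "F", 900.0), # both ">=100", kept
]

matched = same_group_pairs(pairs)
```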