It turns out that the University of Maryland Department of Geography has already done a land cover classification project, which highlights urban areas very well! You’d think that would make me feel quite bad after spending much of last week trying to do this, BUT unfortunately their project used AVHRR (Advanced Very High Resolution Radiometer) data, which has a much coarser resolution than LandSat (1km pixels rather than 30m), and the data is older. There is still an awful lot of analysis I can do with 20 year old 1km resolution data, though, and if nothing else I can try to use some of their methods with the more up to date LandSat data I was using before.

So this is how to download the AVHRR Land Cover Data, import it into Qgis and use it.

Firstly, go to http://glcf.umiacs.umd.edu/data/landcover/ for some info on the whole project. Click on ‘Download via Search and Preview Tool (ESDI)’ if you want to browse the World and select the data you want, OR head straight to the FTP server. Choose your area, projection and resolution (I’m using North America, LatLong and 1km). Eventually you will be presented with a bunch of files; download the one ending in asc.gz, as Qgis can handle this but not the bsq.gz. Extract the file after downloading and boot up Qgis. Now hit ‘Add a Raster Layer’ and select the .asc file you downloaded and extracted. The file will be pretty big (NA is 1.39 GB), so expect a wait. You will now have a greyscale image of your region, where the brightness of each pixel depends on the land cover classification calculated by UMD. Here are the code values, which can be found on their website.
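If you would rather script the extraction step than do it by hand, here is a minimal sketch using only the Python standard library. The function name `gunzip` is made up for illustration, and it simply assumes the download is an ordinary gzip file whose name ends in .gz.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src):
    """Extract a .gz file next to itself and return the new path.

    Hypothetical helper: 'something.asc.gz' becomes 'something.asc'.
    """
    src = Path(src)
    dst = src.with_suffix("")  # drop the trailing '.gz'
    # Stream the data in chunks so a file of over 1 GB is never
    # held in memory all at once.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return str(dst)
```

Call it with the path of the asc.gz file you downloaded and point Qgis at the path it returns.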

Code Values for 1km and 8km data

| Value | Label | RGB Red | RGB Green | RGB Blue |
|---|---|---|---|---|
| 0 | Water | 068 | 079 | 137 |
| 1 | Evergreen Needleleaf Forest | 001 | 100 | 000 |
| 2 | Evergreen Broadleaf Forest | 001 | 130 | 000 |
| 3 | Deciduous Needleleaf Forest | 151 | 191 | 071 |
| 4 | Deciduous Broadleaf Forest | 002 | 220 | 000 |
| 5 | Mixed Forest | 000 | 255 | 000 |
| 6 | Woodland | 146 | 174 | 047 |
| 7 | Wooded Grassland | 220 | 206 | 000 |
| 8 | Closed Shrubland | 255 | 173 | 000 |
| 9 | Open Shrubland | 255 | 251 | 195 |
| 10 | Grassland | 140 | 072 | 009 |
| 11 | Cropland | 247 | 165 | 255 |
| 12 | Bare Ground | 255 | 199 | 174 |
| 13 | Urban and Built | 000 | 255 | 255 |
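
If you want to work with the classification outside Qgis, the code values above translate directly into a lookup table. This is only a sketch: the names `UMD_CLASSES`, `colourise` and `urban_mask` are made up for illustration, and it assumes the raster has already been read into a plain 2D list of integer codes.

```python
# Code values from the UMD table: value -> (label, (R, G, B)).
UMD_CLASSES = {
    0: ("Water", (68, 79, 137)),
    1: ("Evergreen Needleleaf Forest", (1, 100, 0)),
    2: ("Evergreen Broadleaf Forest", (1, 130, 0)),
    3: ("Deciduous Needleleaf Forest", (151, 191, 71)),
    4: ("Deciduous Broadleaf Forest", (2, 220, 0)),
    5: ("Mixed Forest", (0, 255, 0)),
    6: ("Woodland", (146, 174, 47)),
    7: ("Wooded Grassland", (220, 206, 0)),
    8: ("Closed Shrubland", (255, 173, 0)),
    9: ("Open Shrubland", (255, 251, 195)),
    10: ("Grassland", (140, 72, 9)),
    11: ("Cropland", (247, 165, 255)),
    12: ("Bare Ground", (255, 199, 174)),
    13: ("Urban and Built", (0, 255, 255)),
}

URBAN = 13


def colourise(grid):
    """Map a 2D list of class codes to a 2D list of (R, G, B) tuples."""
    return [[UMD_CLASSES[code][1] for code in row] for row in grid]


def urban_mask(grid):
    """Return True wherever a pixel is classed as 'Urban and Built'."""
    return [[code == URBAN for code in row] for row in grid]
```

For example, `colourise([[0, 13]])` gives one water-coloured pixel followed by one urban-coloured pixel.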

To make the map in Qgis look at all interesting we need to apply a colour map. You can build your own by right clicking the layer, selecting Properties and changing the settings, or go to Properties, select ‘Load Style’ and load the style I created, which can be downloaded here (right click and ‘Save target as’). Now you will have a wonderful map showing land use in your region. The area highlighted in white is what I am particularly interested in and worked on for much of last week. The next step is to see if I can work out how to generate the code values easily and apply the method to LandSat data, then compare the result to OSM.

Urban areas are likely to emit more artificial light, right? Well, that’s the idea behind this side-project anyway. The Defense Meteorological Satellite Program (DMSP) monitors meteorological, oceanographic, and solar-terrestrial physics for the United States Department of Defense, and I’ve found this pretty cool high-res (8MB) image of the World at night on Wikipedia. If I can obtain some even higher resolution images, I’m sure I could do some nice analysis, as with the LandSat data, to predict where we should have more information in OpenStreetMap. So this is almost a plea: if you find any more detailed data then please let me know!

So the previous method, with the lovely flowchart and everything, proved to be great for London, but when I tried to use exactly the same method in New York it took far too much urban area away, leaving us with a rather dull image when it should be very bright to indicate the NYC metropolis. Take a look at the difference between New York and London! So I essentially went back to the drawing board and thought, “okay, what are the main features of a LandSat image?” Answer: urban areas, rural areas and water! That means I need to distinguish between urban and rural, then distinguish between urban and water, then somehow combine the two. That’s exactly what I did.

The method is very similar to before, except I’m now using both hue and saturation from the 742 false colour image instead of simply hue. The pretty flowchart for the process is shown below, and for those of you who thought the last one was a mess, wait ’til you see this one!

The method essentially works due to the different properties of hue and saturation. There is a great difference between the hue values for urban and rural areas, so hue is used to distinguish between rural and urban. Here is the hue component of the 742 image with a colour threshold to highlight urban areas (notice that water is also heavily highlighted). Now we apply a colour threshold to the saturation layer and take advantage of the great difference in value between land and sea here. So now we have two layers, one highlighting urban and sea and one highlighting urban and rural, and we need to pick out the urban. This is done by using a clever layer mode in GIMP called ‘darken only’. This compares the pixel values in the two layers (hue and saturation) and displays the lower value, so a pixel only stays bright if it is bright in both layers. In this way water and rural areas are removed from the image. Result!
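
For greyscale layers, ‘darken only’ boils down to taking the minimum of the two layers pixel by pixel. Here is a minimal sketch of that idea outside GIMP. The function name `darken_only` and the tiny three-pixel layers are made up for illustration, and the layers are assumed to have been thresholded already so that every pixel is either 0 (black) or 255 (white).

```python
def darken_only(layer_a, layer_b):
    """Keep the darker (lower) value of each pair of pixels."""
    return [[min(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]


# Made-up thresholded layers; the pixels are urban, rural and water
# from left to right.
hue_mask = [[255, 0, 255]]         # hue threshold: urban and water are white
saturation_mask = [[255, 255, 0]]  # saturation threshold: urban and rural are white

urban_only = darken_only(hue_mask, saturation_mask)
print(urban_only)  # [[255, 0, 0]]
```

Only the urban pixel is white in both layers, so it is the only one that survives.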

The hypothesis I’m working on at the moment is that there should be more OpenStreetMap data (i.e. more nodes, more ways etc.) in urban areas. To find out where these urban areas are I’m using freely available LandSat 7 data downloaded from http://www.landcover.org/data/landsat/. The data comes in the form of 8 different greyscale images corresponding to 8 different spectral bands, ranging from visible blue light (0.45-0.52µm) to thermal IR (10.40-12.50µm). Using bands 3, 2 and 1 for red, green and blue, a ‘true’ colour image of an area can be built. However, this is not the best combination for highlighting urban areas; it turns out that the best combination is bands 7, 4 and 2 for red, green and blue. Head over to my page on using LandSat imagery in GIMP to find a tutorial on all this.
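
In code terms, building the composite just means using one band image per colour channel. A minimal sketch, assuming each band has already been loaded as a 2D list of 0-255 grey values of the same size (the function name `false_colour` and the tiny sample bands are made up for illustration):

```python
def false_colour(bands, order=(7, 4, 2)):
    """Combine three greyscale bands into one image of (R, G, B) tuples.

    bands: dict mapping band number -> 2D list of grey values.
    order: which band feeds the red, green and blue channel respectively.
    """
    red, green, blue = (bands[number] for number in order)
    return [list(zip(r_row, g_row, b_row))
            for r_row, g_row, b_row in zip(red, green, blue)]


# Made-up 1x2 pixel bands, just to show the shape of the data.
sample_bands = {
    7: [[120, 30]],
    4: [[90, 200]],
    2: [[100, 40]],
}
print(false_colour(sample_bands))  # [[(120, 90, 100), (30, 200, 40)]]
```

Passing `order=(3, 2, 1)` instead would give the ‘true’ colour image, provided those bands are in the dict.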

What I’ve been working on over the past few days is how to use the false colour image I’ve produced from bands 7, 4 and 2 to highlight the urban areas. Below is the 742 false colour image, and we can see that urban areas appear quite brown and purple. The aim is to extract that information and make everything else invisible. The problem is that some of the sea to the East is a similar sort of colour to the cities.

I decided the best way to extract the urban areas from the image above would be to decompose the image into hue, saturation and value. The hue is the most interesting component, giving a good contrast between rural and urban areas, but as we can see there is still little difference in colour between some sea areas and some cities, which could prove a problem when extracting just the urban areas.
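
For anyone wanting to try the decomposition outside GIMP, Python’s standard `colorsys` module does the same kind of conversion. This is a rough sketch rather than a copy of what GIMP does internally: the function name `decompose_hsv` is made up, and scaling each component to a 0-255 grey level is an assumption chosen so the layers look like ordinary greyscale images.

```python
import colorsys


def decompose_hsv(pixels):
    """Split a 2D list of (R, G, B) tuples (0-255) into three
    greyscale layers (hue, saturation, value), each scaled to 0-255."""
    hue, saturation, value = [], [], []
    for row in pixels:
        h_row, s_row, v_row = [], [], []
        for r, g, b in row:
            # colorsys works on floats between 0.0 and 1.0.
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            h_row.append(round(h * 255))
            s_row.append(round(s * 255))
            v_row.append(round(v * 255))
        hue.append(h_row)
        saturation.append(s_row)
        value.append(v_row)
    return hue, saturation, value
```

A pure red pixel `(255, 0, 0)` comes out with hue 0, saturation 255 and value 255, while a mid grey has saturation 0.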

The next step is to apply a colour threshold to this image to try to pick out only urban areas and black out everything else. After a great deal of playing around with filters, the urban threshold here appears to be within light levels 196-212. After applying this filter the image below is obtained.
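
The threshold itself is simple to express in code: every pixel whose level falls inside the range becomes white and everything else becomes black. A minimal sketch, assuming the hue layer is a 2D list of 0-255 values (the function name `colour_threshold` is made up, and the 196-212 defaults are just the values found for this particular image):

```python
def colour_threshold(layer, low=196, high=212):
    """Return a black and white copy of a greyscale layer:
    255 where low <= pixel <= high, 0 everywhere else."""
    return [[255 if low <= pixel <= high else 0 for pixel in row]
            for row in layer]


print(colour_threshold([[195, 196, 212, 213]]))  # [[0, 255, 255, 0]]
```

Other scenes will most likely need a different range, so treat the defaults as a starting point only.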

As we can see, the urban areas are nicely highlighted in white and everything else is black. Now the aim is to compare the brightness of every pixel in this image with OSM data. We know the coordinates of the image above and we know that each pixel is 30m x 30m, so this should be easy enough using a simple bounding box query of the OSM database. Let’s hope that there is a relationship between OSM data and the brightness of the pixels above. I’ll be back with any results when I have them, and then hopefully I’ll be rolling this out across the world.
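
Working out the bounding box of a single pixel is just arithmetic on the image origin and the pixel size. Here is a sketch under a few loud assumptions: the image is in a projected coordinate system measured in metres, `x0` and `y0` are the coordinates of the top-left corner of the image, and rows count downwards. The function name `pixel_bbox` and the sample coordinates are made up, and the result would still need converting to latitude and longitude before it could be used in an OSM bounding box query.

```python
def pixel_bbox(col, row, x0, y0, pixel_size=30):
    """Bounding box (min_x, min_y, max_x, max_y) of one pixel, in metres.

    col, row: zero-based pixel position, counted from the top-left corner.
    x0, y0:   projected coordinates of the image's top-left corner.
    """
    min_x = x0 + col * pixel_size
    max_x = min_x + pixel_size
    max_y = y0 - row * pixel_size  # y decreases as we move down the image
    min_y = max_y - pixel_size
    return (min_x, min_y, max_x, max_y)


# Made-up origin, purely for illustration.
print(pixel_bbox(0, 0, 500000, 5700000))  # (500000, 5699970, 500030, 5700000)
```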