There are a lot of Wikipedia visualizations. Some concentrate on article
contents, others on the links between articles and some use the geocoded
content (like in my previous blog post).

This visualization is novel because it uses the geographical content
of Wikipedia in conjunction with the links between articles. In other
words, if a geocoded article (that is, an article associated with a
location like a city) links to another geocoded article, a line will be
drawn between these two points. The result can be found on the map on
the left.

Read on for zoomed views, slideshows, browsable maps, etc.

Methodology

Scroll down to see the slideshows, pretty pictures and interactive maps.

The first thing I had to do was extract the geographical data included
in the articles and the links between the articles. Instead of parsing
the very complicated Wikipedia markup, I chose to use the good work done
by the folks at GeoNames. In the download section, there is a SQL file
with the name of every geocoded Wikipedia article. Then, I downloaded
all English articles in Wikipedia (9GB compressed, about 40GB
uncompressed) and used a bit of regex magic to extract reentrant links
(that is, hyperlinks that point to geocoded articles). After these
steps, I was left with two datasets: a list of all geocoded articles and
a list of all links between articles.
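
As a rough illustration of this extraction step (not the actual code
used for this map), the sketch below pulls `[[Target]]` and
`[[Target|label]]` links out of a snippet of wiki markup and keeps only
those whose source and target are both in the geocoded set. The names
`WIKILINK`, `normalize` and `geocoded_links`, the sample titles and the
title normalization rules are all assumptions made for the example.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# only the target title is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def normalize(title):
    """Collapse underscores and whitespace, capitalize the first letter."""
    title = " ".join(title.replace("_", " ").split())
    return title[:1].upper() + title[1:]


def geocoded_links(source_title, wikitext, geocoded):
    """Return sorted (source, target) pairs where both ends are geocoded."""
    source = normalize(source_title)
    if source not in geocoded:
        return []
    targets = {normalize(m.group(1)) for m in WIKILINK.finditer(wikitext)}
    return sorted(
        (source, t) for t in targets if t in geocoded and t != source
    )


# Hypothetical sample data, standing in for the GeoNames list and a dump page.
geocoded = {"Montreal", "Quebec City", "Paris"}
text = (
    "[[Montreal]] is near [[quebec City|Québec]] and far from "
    "[[Paris#History]]; see also [[Poutine]]."
)
links = geocoded_links("Montreal", text, geocoded)
```

On a full dump, the same function would be called once per article, with
the geocoded set loaded from the GeoNames file; a simple pattern like
this one ignores templates and redirects, so it is only an approximation.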

To draw the map, I used the same technology I developed for my map of
scientific collaborations. I had to adjust the tool to add features like
other geographical projections (the Mercator projection, while simple,
makes Greenland seem as large as Africa), linear transformations, etc.
The datasets computed in the previous steps were then parsed and drawn
by my mapping tool. I then played with the colors in Photoshop to
convert the grayscale output to color. To build the browsable and
overlay maps, I used the fantastic MapTiler tool. By the way, the input
projection for this tool is Equidistant Cylindrical - knowing this would
have saved me a lot of time!
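
To make the projection remarks concrete, here is a minimal sketch (again,
not the code of the mapping tool itself) of the Equidistant Cylindrical
mapping from latitude/longitude to pixel coordinates, assuming the
standard parallel is the equator and the image spans the whole globe. A
second helper gives the well-known Mercator area inflation factor,
sec²(latitude), which is why high-latitude regions like Greenland look
so large. The function names and image sizes are assumptions made for
the example.

```python
import math


def equirectangular_pixel(lat, lon, width, height):
    """Map degrees to pixel coordinates on a whole-globe image.

    x grows eastward from longitude -180, y grows southward from latitude 90.
    """
    x = (lon + 180.0) / 360.0 * width
    y = (90.0 - lat) / 180.0 * height
    return x, y


def mercator_area_inflation(lat):
    """Factor by which Mercator enlarges areas at a given latitude."""
    return 1.0 / math.cos(math.radians(lat)) ** 2


# A link between two geocoded articles becomes a segment between two pixels.
# The coordinates below are approximate and only serve as an example.
start = equirectangular_pixel(45.5, -73.6, 3600, 1800)  # around Montreal
end = equirectangular_pixel(48.9, 2.35, 3600, 1800)     # around Paris
```

Because this projection is linear in both latitude and longitude, a
given number of pixels always corresponds to the same number of degrees,
which makes it a convenient input format for tiling tools.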

Slideshow

[huge_it_gallery id="4"]

This slideshow contains zoomed parts of the map for different countries,
continents and regions. Click on a picture to enlarge it. Scroll to the
bottom of this blog post to download the full size map (200M pixels -
18MB JPEG file).

Browsable Map

This map uses the Robinson projection. It is a "compromise" projection,
meaning that while it doesn't resolve all the problems found in other
projections, it keeps most distortions small.

Data & High Resolution Files

There's also a high resolution file, but Amazon was charging me a pretty
penny to host and serve it, so I removed it. Let me know if you want it;
I'll send you the file. I also have a 1.7 gigapixel file, but it is too
large to host here, so let me know if you need it. It uses the
Equidistant Cylindrical projection, not the Robinson one like the other
high resolution file.

The input dataset (30MB compressed, around 95MB uncompressed) can be
downloaded here. The fields should be self-explanatory.

The drawing tool will eventually be open sourced, but I need time to
clean it up.