Sunday, 18 November 2012

How to minify GeoJSON files?

You can't do web mapping these days without knowing your GeoJSON. It's the vector format of choice among popular mapping libraries like Leaflet, D3.js and Polymaps. Size matters on the web, especially if you want to distribute complex geometries, like the world's countries. The challenge is even bigger if you want to target mobile users - or support web browsers with poor vector handling (IE < 9). This blog post will show you how to minify your GeoJSON files before sending them over the wire.

The first thing you should do is to generalize your vectors so they don't contain more detail than you need. In a previous blog post, I was able to remove 90% of the coordinates without losing too much detail for the map scale I wanted to use. This will of course have a great effect on the file size.

Today, I'm going to use country borders from the Natural Earth dataset. These datasets are already generalized for different scales (1:10m, 1:50m, and 1:110m), so I'll use them as they are. The 1:110m (small scale) and 1:50m (medium scale) shapefiles will cover the needs of the thematic world maps I plan to make.

Let's open the datasets in QGIS. If you look at the attribute table you'll see that each dataset contains 63 attributes, which makes them very versatile. For your web maps, you probably need just a few of the attributes, and you should remove the ones you don't need. I'm keeping the country name and the ISO 3166-1 country codes (alpha-2, alpha-3, and numeric), which can be used to link country geometries to statistical data.
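If you would rather script this step than do it in QGIS, the same attribute pruning can be sketched in Python once the data is in GeoJSON. This is only an illustration: the property names (`name`, `iso_a2`, `iso_a3`, `iso_n3`) are assumptions based on common Natural Earth field names, so check them against your own attribute table, and the sample feature is made up.

```python
import json

# Property names are assumptions based on common Natural Earth field names.
KEEP = {"name", "iso_a2", "iso_a3", "iso_n3"}

def keep_properties(collection, keep=KEEP):
    """Return a copy of a FeatureCollection with only the wanted properties."""
    slim_features = []
    for feature in collection["features"]:
        props = feature.get("properties") or {}
        slim = {key: value for key, value in props.items() if key in keep}
        slim_features.append({**feature, "properties": slim})
    return {**collection, "features": slim_features}

# Hypothetical one-feature sample, not real Natural Earth data.
sample = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"name": "Norway", "iso_a2": "NO", "iso_a3": "NOR",
                       "iso_n3": "578", "pop_est": 0, "continent": "Europe"},
        "geometry": {"type": "Point", "coordinates": [10.7, 59.9]},
    }],
}

slimmed = keep_properties(sample)
print(json.dumps(slimmed["features"][0]["properties"]))
```

The function builds new dictionaries instead of editing in place, so the original data stays available if you later need another attribute.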

The important thing is that I'm only keeping one decimal (coordinate precision) for the 110m dataset, and two decimals for the 50m dataset, which is sufficient for my map scales. This will reduce the size of the GeoJSON files by more than half. The size of the 110m GeoJSON is now 207 kB and the 50m version is 1,897 kB. But we can do better.
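To show what limiting the coordinate precision does to the data, here is a minimal Python sketch that rounds every coordinate in a geometry. It is not the tool used for the files above, just an illustration of the effect; the function names and the sample line are hypothetical.

```python
def round_coords(coords, ndigits):
    """Recursively round a GeoJSON coordinate array of any nesting depth."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(item, ndigits) for item in coords]

def round_geometry(geometry, ndigits):
    """Return a copy of a geometry with rounded coordinates.

    GeometryCollection objects (which have no "coordinates" member)
    are not handled in this sketch.
    """
    return {**geometry, "coordinates": round_coords(geometry["coordinates"], ndigits)}

# Hypothetical sample geometry.
line = {"type": "LineString", "coordinates": [[10.123456, 59.987654], [10.2, 60.04]]}
rounded = round_geometry(line, 1)
print(rounded["coordinates"])
```

Note that rounding can leave consecutive points with identical coordinates. This sketch does not remove them, so a further clean-up pass could save a few more bytes.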

Stripping the unnecessary whitespace reduces the file size of the 110m GeoJSON from 207 to 156 kB, without losing any data quality. More than 400k whitespace characters were removed from the 50m GeoJSON file, reducing its size from 1,897 to 1,481 kB.
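One way to remove the whitespace without touching the text inside names or other attribute values is to parse the file and write it back out compactly. The sketch below uses Python's standard `json` module and is one possible approach, not necessarily how the files above were produced; `minify_geojson` is a hypothetical helper name.

```python
import json

def minify_geojson(text):
    """Re-serialize JSON text without whitespace between tokens.

    Whitespace inside string values is preserved. ensure_ascii=False keeps
    non-ASCII characters as they are instead of expanding them to \\uXXXX
    escapes, which would make the output larger.
    """
    data = json.loads(text)
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)

# Hypothetical pretty-printed input.
pretty = '{ "type": "Feature",\n  "properties": { "name": "Costa Rica" } }'
compact = minify_geojson(pretty)
print(compact)
```

Because the text goes through a real parser, a space inside a country name such as "Costa Rica" survives, which is where a careless search-and-replace can go wrong.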

If your web server supports on-the-fly gzip compression, the 110m GeoJSON will end up being 45 kB and the 50m version will be 430 kB. Not bad!
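If you want a rough idea of the transfer size before deploying, you can compress the data yourself. This is a minimal sketch using Python's `gzip` module; the compression level your web server uses may differ, so treat the result as an estimate. The payload here is a made-up, highly repetitive example, not one of the files above.

```python
import gzip

def gzipped_size(data, level=6):
    """Return the size in bytes of data after gzip compression."""
    return len(gzip.compress(data, compresslevel=level))

# Hypothetical payload: the same tiny feature repeated many times.
payload = b'{"type":"Feature","geometry":{"type":"Point","coordinates":[10.1,59.9]}}' * 200

print(len(payload), "bytes raw,", gzipped_size(payload), "bytes gzipped")
```

For a real file, read it in binary mode and pass the bytes to `gzipped_size`.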

NB! Mike Bostock’s TopoJSON would allow us to compress the GeoJSON even more, while preserving topology (shared borders between countries) - but we would need to use a map client supporting the format. Looks promising!

9 comments:

Using regular expressions like that can easily break your labels or attributes. I would suggest using a JSON parser that supports minification to remove whitespace.

You can further minify the GeoJSON by:
- removing invisible geometries, if the simplification process did not already do so.
- reducing the output precision of float coordinates according to the desired zoom level.
- using shorter ids for all attributes.

I've been playing with ogr2ogr to convert shapefiles to GeoJSON and I've used the -simplify option to reduce file size. Looking at the ogr2ogr reference I see the -lco option you've used, but where does the COORDINATE_PRECISION come from? Is there another reference I can use?

Hi, thanks for the useful article, just wanted to say that I released a very simple JavaScript page for automatically removing attributes and whitespace from GeoJSON files. It takes an input GeoJSON and removes every attribute except the country IDs and names. You can find it here on GitHub. https://github.com/Pimentoso/GeoJSON-Attribute-Cleaner