Processing compressed OpenStreetMap Data with Java

by Pascal Neis - Published: October 9th, 2017

This blog post contains a summary on how you can write your own Java classes to process OpenStreetMap (OSM) pbf files. PBF is a compression format, which is nowadays more or less the standard utilized for reading and writing OSM data quickly. In the OSM world, many tools and programs implemented this file format (you can find additional information here). However, I think the following samples for reading and writing such compressed OSM data can be very helpful. In particular, if someone has to create some sort of test data or has to read some specific mapped objects of interest for her/his own project. The well-known Java Osmosis tool (command line application for processing OSM data), provides several libraries that are the basis for this brief tutorial.

Step 1: Maven is the key – If your Java project is already managed by Maven, you can just add the following lines to your pom.xml. It downloads and adds the required jar-files that are needed to process compressed OSM data to your project.

If you don’t use Maven, you can download the aforementioned dependencies here and add the jars as described here to your eclipse build path.

Step 2: Implementing the Sink interface for reading OSM data – Now, after you added the required libraries to your project, you can create your own class to read compressed OSM data. The following MyOSMReader class is an easy example for that scenario: It reads an OSM pbf files and prints all Ids of ‘ways’ with a ‘highway’ key.

Step 3: Implementing the Source Interface for writing OSM data – Similarly to Step 1 to read an OSM file, you will have to implement again an interface to write data as well. This time the Osmosis core task Source interface is being utilized. The following example class writes 10 Nodes to a new OSM pbf file.

Step 4: That’s it – You should now be able to write your own classes that allow you to process compressed OSM data.

One last thing: When writing your own Java classes to process OSM data, you should always keep in mind, that the PBF file has it’s own object ordering format of the OSM objects: Each file first contains Nodes, then Ways and at the end Relations. That means, a Way element only contains references to Node ids. It doesn’t contain the coordinates or any tags of the actual nodes that are used for the Way. Thus, if you want a complete Way element with its geometry, you have two options: Store all Nodes in some kind of map or read the entire OSM file twice. Furthermore, if you’re interested in additional compressing options during writing a compressed OSM file, you should have a look at this class.