Monthly Archives: April 2012

Three weeks ago we mentioned we were still perfecting the ‘redaction bot’, the piece of code that goes through and redacts (removes/hides) any data that isn’t compatible with the new licence. Unfortunately, it will take more time to get the bot working perfectly. The good news is that the bot is now passing more tests than ever; the bad news is that it still isn’t passing all of them. Several people are working to make it error-free.

Once these system tests pass, live data testing will be conducted against a test server that is already configured and waiting. If that test succeeds, an isolated portion of the live database will be processed, most likely the island of Ireland. If this goes well, the rest of the data will be processed.

On a positive note: in the last few weeks we’ve also secured agreement to keep several contributions in our database. We would like to thank everyone who helped us make that happen.

A summary of all the things happening in the OpenStreetMap (OSM) world.

The OSM database has been back in read/write mode since last Thursday. Thanks to the admins for their incredible work moving everything to the new server! The changes to the dataset required by the license change will be made in the coming days and will run in the background. Also, you can find the last CC-BY-SA OSM planet file here.

“All I Want for OpenStreetMap is …”? Some thoughts and wishes from Mikel and Kate for OSM.

The organization Development Seed would like to create some new contribution tools for OSM. You can help here!

With the new server successfully installed by our sysadmin team, we’re now onto the second part of our migration – the data ‘redaction’ work required to move to the Open Database License. We promised our first progress report next week, but lots of people have been asking, so here’s an update four days early.

The code changes to the OpenStreetMap API have been completed and successfully reviewed. openstreetmap.org is therefore ready to distribute the new data. (Thanks to Matt Amos for the code and Tom Hughes for the review work.)

The next part is the ‘redaction bot’. This is the piece of code that, for an area of OpenStreetMap data, goes through and redacts (removes/hides) any data that isn’t compatible with the new licence. This is the most crucial part of the whole process: we aim not to retain data whose creators haven’t given permission for it to be distributed under the Open Database License, and conversely, not to inadvertently delete anything from the vast majority which is compatible.

Since Wednesday we’ve been running tests against real-world data (thanks to Frederik Ramm for help with this). We’re not yet 100% happy with the results, so we are continuing to work on the code. As you would expect, we will not set the bot running until we are absolutely confident that it is producing accurate results. With the four-day Easter weekend just beginning, we currently expect that this will be next week. This puts us a few days behind schedule, but we owe it to our mappers to get this right.

If you’re a developer, you can help fix the currently failing tests: check out the code at https://github.com/zerebubuth/openstreetmap-license-change. If you’re a mapper, this gives you a few more days to get your area shipshape! And if you’re a data consumer, you can, of course, continue to use the data under our existing license, CC-BY-SA 2.0.

We’ll have a further update next week and, in any case, before the bot starts running.

The sysadmin team completed the database migration to the new DB server on schedule during the morning of 04 April 2012. The API is now back to normal read-write operation. The final steps of the license upgrade will now proceed as outlined in the March – April service schedule announcement.

Other items of possible interest as the license upgrade process proceeds:

osm.org map tile generation will recommence within the next few hours.

Replication diffs for the license upgrade period have started, following requests from mappers. These CC-BY-SA data replication diffs are found in the redaction-period directory on planet: planet.openstreetmap.org/redaction-period. These diffs will only cover the period up until the switch to the new license. General consumers of OSM data may choose to consume these diffs or not at their discretion.

ODbL diffs will be located in another directory to be announced in future.

During the redaction period it is recommended that editors save their work early and often, to reduce both the chance and the complexity of conflicts with the background redaction process.

OpenStreetMap contributors have used track files from their GPSr devices for years while improving OSM data. They have shared those track files and the track points have been available to other mappers via editors and the web site. Now we are providing a way for you to get all of those points at once.

This is the collected GPS point data from the first seven and a half years of OpenStreetMap. It is a very large collection of points and it is very raw data.

the compressed file is 7 GB in size

uncompressed, the file is a 55 GB text file

the data consists of coordinate pairs only, with no track file or metadata

points were contributed by thousands of users

points were contributed as thousands of distinct track files

the data includes 2,770,233,904 points
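For planning purposes, the figures above give a rough average size per point. The following is only a back-of-the-envelope sketch: it assumes that a GByte here means 10**9 bytes and that the quoted sizes are rounded, so the results are approximate.

```python
# Figures quoted in the list above.
POINTS = 2_770_233_904
UNCOMPRESSED_BYTES = 55 * 10**9  # assumption: 1 GByte = 10**9 bytes
COMPRESSED_BYTES = 7 * 10**9     # same assumption

# Average storage cost of one coordinate pair in each form.
bytes_per_point_raw = UNCOMPRESSED_BYTES / POINTS
bytes_per_point_packed = COMPRESSED_BYTES / POINTS

print(f"uncompressed: about {bytes_per_point_raw:.1f} bytes per point")
print(f"compressed:   about {bytes_per_point_packed:.1f} bytes per point")
```

That works out to roughly 20 bytes of text per point uncompressed, and between 2 and 3 bytes per point compressed.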

Is this a big deal?

This might be the largest collection of Open Data GPS points published. Do you know of larger collections? Tell us in the comments.

Working with this file might not be your cup of tea. Over time, I expect that tools will emerge from the community to make this data easier to manage. For now, it is raw and it is extensive.

All of this data has been previously available to OpenStreetMap contributors in other forms, via editors and the web site. This file provides a new way to get the same data and to get all of it at once.

Example data

If you do decide to work with the file, this is the format that you can expect.

What format is that?

These are comma-separated, raw lat / lon coordinates in a simple text format. To get the coordinates in decimal degrees, divide each number by 10**7. The points are sorted by location, starting in the far southeast of the globe (90 S,180 E) and moving northwest.
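The conversion described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than an official tool: it assumes one "lat,lon" pair per line, in that order, and the sample values are made up for the example, not taken from the real file. The function names parse_point and iter_points are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

SCALE = 10**7  # the post says to divide each number by 10**7


def parse_point(line: str) -> Tuple[float, float]:
    """Convert one 'lat,lon' line of scaled integers to decimal degrees."""
    lat_raw, lon_raw = line.strip().split(",")
    return int(lat_raw) / SCALE, int(lon_raw) / SCALE


def iter_points(lines: Iterable[str]) -> Iterator[Tuple[float, float]]:
    """Lazily yield (lat, lon) pairs, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield parse_point(line)


# Hypothetical sample lines, NOT taken from the real file.
sample = ["515000000,-1200000\n", "-338500000,1512000000\n"]
points = list(iter_points(sample))
print(points)
```

Because the uncompressed file is about 55 GB, it is safer to pass an open file handle to iter_points and consume the points one at a time than to read the whole file into memory.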

Thanks

Thanks as always to the hundreds of thousands of OpenStreetMap contributors over the seven-plus years of the project so far. Thanks to the sysadmins for moving this data to a place where we can all access it.

This version of the GPS data file is CC-BY-SA and published by OpenStreetMap and Contributors. The image in this article is a visualization of some of this point data in Europe. The image is licensed similarly and was created by Dave Stubbs.