Linus’ Law and OpenStreetMap

One of the interesting questions that emerged from the work on the quality of OpenStreetMap (OSM) in particular, and Volunteered Geographical Information (VGI) in general, is the validity of the ‘Linus’ Law’ for this type of information.

The law came from Open Source software development and states that ‘Given enough eyeballs, all bugs are shallow’ (Raymond, 2001, p.19). For mapping, I suggest that this can be translated into the number of contributors that have worked on a given area. The rationale is that if there is only one contributor in an area, they might inadvertently introduce some errors. For example, they might forget to survey a street or might position a feature in the wrong location. If there are several contributors, they might notice these inaccuracies or ‘bugs’, and therefore the more users, the fewer ‘bugs’.

In my original analysis, I looked only at the number of contributors per square kilometre as a proxy for accuracy, and provided a visualisation of the difference across England.
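To make the proxy concrete, here is a minimal sketch of how contributors per square kilometre could be counted. It is an illustration only, not the code used in the original analysis: the function name `contributors_per_cell`, the tuple layout of the input and the assumption that node coordinates are already in a projected system measured in metres (for example British National Grid) are all mine.

```python
from collections import defaultdict


def contributors_per_cell(nodes, cell_size_m=1000.0):
    """Count distinct contributors in each square grid cell.

    nodes: iterable of (easting_m, northing_m, user_id) tuples.
    The tuple layout and the metre-based coordinates are assumptions
    made for this sketch, not something taken from the OSM data model.
    """
    users_by_cell = defaultdict(set)
    for easting, northing, user_id in nodes:
        # Integer cell index: which 1 km column and row the node falls in.
        cell = (int(easting // cell_size_m), int(northing // cell_size_m))
        users_by_cell[cell].add(user_id)
    # A set per cell means each user is counted once, however many nodes they added.
    return {cell: len(users) for cell, users in users_by_cell.items()}
```

Counting distinct users rather than nodes matters here, because a single very active contributor can account for most of the nodes in a cell.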

During the past year, Aamer Ather and Sofia Basiouka looked at this issue by comparing the positional accuracy of OSM in 125 sq km of London. Aamer carried out a detailed comparison of OSM and the Ordnance Survey MasterMap Integrated Transport Network (ITN) layer. Sofia took the results from his study and split them by grid square, so it was possible to calculate an overall value for every cell. The value is the average of the overlap between OSM and OS objects, weighted by the length of the ITN object. The next step was to compare the results to the number of users in each grid square, as calculated from the nodes in the area.
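The per-cell value described above is a length-weighted mean, which can be sketched as follows. This is a hedged illustration rather than the procedure Aamer and Sofia actually implemented: the function name `weighted_overlap_per_cell`, the input layout and the use of a percentage for the overlap are assumptions made for the example.

```python
from collections import defaultdict


def weighted_overlap_per_cell(segments):
    """Length-weighted mean overlap for each grid cell.

    segments: iterable of (cell_id, itn_length_m, overlap_pct) tuples,
    one per ITN object. This layout is hypothetical; the study's own
    data format is not described in the text.
    """
    total_length = defaultdict(float)
    weighted_sum = defaultdict(float)
    for cell_id, length, overlap in segments:
        total_length[cell_id] += length
        weighted_sum[cell_id] += length * overlap
    # Skip cells with zero total length to avoid dividing by zero.
    return {
        cell: weighted_sum[cell] / total_length[cell]
        for cell in total_length
        if total_length[cell] > 0
    }
```

For example, a cell with a 100 m road at 90% overlap and a 300 m road at 70% overlap gets (100 × 90 + 300 × 70) / 400 = 75%, so the longer road dominates the cell's value.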

The results show that, above 5 users, there is no clear pattern of improved quality. The graph below provides the details, but the pattern is that the quality, while generally very high, is not dependent on the number of users, so Linus’ Law does not apply to OSM (and probably not to VGI in general).

From looking at OSM data, my hypothesis is that, due to the participation inequality in OSM contribution (some users contribute a lot while others don’t contribute very much), the quality is actually linked to a specific user, and not to the number of users.

Yet, I will qualify the conclusion with the statement that further research is necessary. Firstly, the analysis was carried out in London, so it is necessary to check what is happening in other parts of the country, where different users collected the data. Secondly, the analysis did not include the interesting range of 1 to 5 users, so it might be the case that quality improves rapidly from 1 to 5 users and that additional users make no difference beyond that. Maybe the big change is from 1 to 3? Finally, the analysis focused on positional accuracy, and it is worth exploring the impact of the number of users on completeness.