Open Source Software is Important for Modern Science

A Call for Open Crowdsourcing Platforms

Humanity is gaining new abilities that are opening frontiers for scientific research. The proliferation of smartphones allows us to create ad-hoc distributed data collection networks that we can use to solve problems we previously thought to be unsolvable. We can now crowdsource live data through active user inputs as well as connected sensors, and we can do this on a previously unobtainable scale, both locally and globally.

However, we are simultaneously facing grave dangers that jeopardize the stability of our global civilization. Our desire for a rapidly increasing quality of life has caused runaway problems like global climate change; we are not currently on a sustainable path to the future. Not coincidentally, the same greed that caused these serious problems has also produced, potentially, some of the solutions.

There are many problems that we might solve with the large-scale crowdsourcing of data obtained through smartphones. Waze has made significant progress collecting live traffic data, which is used to help users avoid traffic, thus potentially lessening the overall negative effects of heavy traffic and therefore pollution. Some researchers are working to build live earthquake detection and response technology by using accelerometers in stationary smartphones to detect seismic wave signatures. Others are working on crowdsourcing noise data to build noise pollution maps. I’m developing a network of atmospheric pressure sensors to better understand Earth’s atmosphere, with a direct goal of building significantly improved weather models: this is pressureNET. The exciting new abilities that we gain from smartphones seem limited only by our imaginations.

Some of the problems that we’re facing currently, as a species, are urgent. Global climate change, for example, is a very difficult problem that’s already getting away from us. Learning how our actions affect our planet is only one step, and it’s a difficult one. Even more difficult is changing our behaviours so that we can live sustainably on Earth and elsewhere.

So far, we have been too slow. We took too long to figure out how our pollution was changing the atmosphere. We’re taking too long communicating these concepts to everyone. We’re taking too long developing technologies to minimize our future impact. With every year that goes by, our problems get worse, and even though the work we’re doing may be good, so far it’s not good enough. Speed matters.

This is why I build open source software, and specifically why pressureNET is open source. Sharing the work that we do, especially in large-scale, crowdsourced data networks, can enable others to work on newer and harder problems. Forcing everyone to reinvent the wheel is not a fast way to get to the future, especially when there is an unknown but large number of problems that the same solution might solve. There are some merits to keeping source code closed; for example, proprietary solutions may result in greater diversity of potential solutions, since everyone is forced to solve the same problem their own way. However, this does not outweigh the positives that result from sharing your work: the barriers to entry in software development are low enough that code can be shared and solutions can still be reinvented by anyone who wants to try a different approach.

This concept doesn’t end with source code and applications. We should also be sharing entire platforms for data collection and analysis to further reduce the duplication of work that is prerequisite to scientific research. To my knowledge there does not yet exist a framework for software developers to “plug and play” crowdsourced data networks. The most promising project seems to be Code in the Air (CITA), developed by MIT’s Networks and Mobile Systems group. However, CITA doesn’t yet solve all the problems needed to make it simple for software developers to build arbitrary crowdsourcing applications.

I wish I had understood the need for a general purpose, open source crowdsourcing platform sooner, because I would have built it. Instead, I’ve written a more narrowly focused software development kit that makes it simple to collect atmospheric pressure data on Android. This project is the pressureNET SDK, which we use to build the core of the pressureNET app. The code is designed to make it simple for developers to integrate atmosphere sensor crowdsourcing into their own apps, both to contribute to the global data collection effort and to pull results from it.

It’s clear now that we must expand the scope of the pressureNET SDK, one step at a time, to make it into a general purpose smartphone crowdsourcing platform. The goal is that it should be a simple task for any developer (on any platform) to build a sensor- or user-based crowdsourcing app. It should be even simpler for these networks to connect together and for the results to be shared. This is a big project and I’m pretty busy right now, so it might take me too long to build this. If these ideas excite you, I invite you to either check out pressureNET’s SDK and expand it to a greater scope, or to build your own from scratch – either way, I urge you to make your solution open source so we can all share the benefits of faster scientific progress.
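To make the "contribute and pull results" idea a little more concrete, here is a minimal sketch of what a sensor-agnostic crowdsourcing interface might look like. It is not the pressureNET SDK's API: the names Reading, CrowdsourceNetwork, contribute, query and mean_value are all hypothetical, the sample coordinates and values are made up, and the in-memory list stands in for what would really be a networked service.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass(frozen=True)
class Reading:
    """One crowdsourced observation (hypothetical schema)."""
    sensor: str       # sensor type, e.g. "pressure" or "noise"
    value: float      # measurement in the sensor's own unit
    lat: float        # latitude in degrees
    lon: float        # longitude in degrees
    timestamp: float  # seconds since the Unix epoch


class CrowdsourceNetwork:
    """In-memory stand-in for a shared data network (hypothetical)."""

    def __init__(self) -> None:
        self._readings: List[Reading] = []

    def contribute(self, reading: Reading) -> None:
        """Add one observation to the shared pool."""
        self._readings.append(reading)

    def query(self, sensor: str, lat_min: float, lat_max: float,
              lon_min: float, lon_max: float) -> List[Reading]:
        """Return readings of one sensor type inside a bounding box."""
        return [
            r for r in self._readings
            if r.sensor == sensor
            and lat_min <= r.lat <= lat_max
            and lon_min <= r.lon <= lon_max
        ]

    def mean_value(self, sensor: str, lat_min: float, lat_max: float,
                   lon_min: float, lon_max: float) -> Optional[float]:
        """Average the matching readings, or None if there are none."""
        hits = self.query(sensor, lat_min, lat_max, lon_min, lon_max)
        return mean(r.value for r in hits) if hits else None


# Two apps contribute pressure readings; a third contributes noise data.
network = CrowdsourceNetwork()
network.contribute(Reading("pressure", 1012.0, 45.50, -73.57, 0.0))
network.contribute(Reading("pressure", 1014.0, 45.52, -73.55, 60.0))
network.contribute(Reading("noise", 62.5, 45.51, -73.56, 30.0))

# Any app can then pull an aggregate for a region of interest.
print(network.mean_value("pressure", 45.0, 46.0, -74.0, -73.0))
```

Keeping the sensor type as plain data rather than baking it into the class is one way the same small interface could serve pressure, noise, or any other kind of reading, which is the sort of generality a broader platform would need.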

3 thoughts on “Open Source Software is Important for Modern Science”

Great insights. I agree that open source solutions do increase the quality and speed to market. I think an issue concerning software development is that we (software engineers) were reluctant to use things like open source frameworks, because of bloat and features not needed. However, now we have frameworks that allow for extensible middleware (like OWIN) that we can share and customize. I think for an open source project to succeed, we need to pay close attention to a very modular architecture. Thanks.

I couldn’t agree more. That’s why I started building an Integrated Research Environment three years ago. I am finishing the architecture and open sourcing it. Please contact me if you are interested in helping develop this.