Insider views about technology from the front line

The Internet of Things is a term being bandied about to the point of near meaninglessness now that it has hit the mainstream of the NYT and the BBC. Yet while the mainstream media struggles to describe how and why smart sensor arrays will mean you spend less time in traffic, ultimately pay more for your electricity and always have fresh fruit, there is a quiet revolution taking place.

The action taking place is the creation of what I call the Sensor Commons. Why is this a revolution? Because as a population we are deciding that governments and civic planners no longer have the ability to provide meaningful information at a local level.

Two posts summarise this activity and its implications beautifully for me.

The first, by Ed Borden from Pachube, is on the creation of a community driven Air Quality Sensor Network. His passionate call to arms highlights that we have no real-time system for measuring air quality. Further, what data does exist and has been released by governments is transient due to the sampling method (ie the sensor is moved from location to location over time). Summarising a workshop on the topic, he discusses how a community oriented sensor network can be created, funded and deployed.

The implications of this quiet revolution are discussed by Javaun Moradi in his post on how open sensor networks will affect journalism. Javaun discusses how citizen data will re-establish the localised roots of journalism by reporting on issues that matter locally, with accurate, real time data to help drive the story. Obviously Javaun has an interest in media so he’s taking that slant, yet this is but one of the many implications of the Sensor Commons.

We don’t know what we’re going to get when we arrive at a point where there is hyperlocalised data available on any conceivable measure – sound levels, temperature, rain levels, water quality, air quality, the number of cars passing a location in real time. The needs are going to be driven purely by local communities – by bottom-up interest groups that have access to cheap technologies to enable the sensor creation as well as a local need or concern that drives action.

Having observed many of these projects as they mature, I find that many of them require deeply technical knowledge in order to be effective. For the true Sensor Commons, as I see it, we need to have deep engagement with the population as a whole, regardless of technical ability or knowledge.

What is the Sensor Commons?

Before I get into the fundamental requirements of a Sensor Commons project it’s worth defining what I mean by the term. For me the Sensor Commons is a future state whereby we have data available to us, in real time, from a multitude of sensors that are relatively similar in design and method of data acquisition, and that data is freely available, whether as a data set or by API, for anyone to use in whatever fashion they like.

My definition is not just about “lots of data from lots of sensors” – there is a subtlety to it implied by the “relatively similar in design and method of data acquisition” statement.

In order to be useful, we need to ensure we can compare data relatively faithfully across multiple sensors. This doesn’t need to be perfect, nor do the sensors all need to be calibrated together; we simply need to ensure that they are “more or less” recording the same thing with similar levels of precision and consistency. Ultimately in a lot of instances we care about trended data rather than individual points, so this isn’t a big problem so long as an individual sensor is relatively consistent and there isn’t ridiculous variation between sensors placed in the same conditions.
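One rough way to test the “more or less recording the same thing” idea is to put a handful of sensors side by side and look at how far apart their readings are. The sketch below is only an illustration: the function names, the sample temperatures and the 20% tolerance are all my own assumptions, not figures from any project mentioned here.

```python
from statistics import median


def relative_spread(readings):
    """Return (max - min) / median for readings taken side by side."""
    mid = median(readings)
    if mid == 0:
        raise ValueError("median is zero; relative spread is undefined")
    return (max(readings) - min(readings)) / abs(mid)


def roughly_comparable(readings, tolerance=0.2):
    """True if co-located sensors agree to within the chosen tolerance.

    The 20% default is an illustrative assumption, not a recommendation.
    """
    return relative_spread(readings) <= tolerance


# Hypothetical temperatures (degrees C) from four sensors sharing one shelf.
bench_test = [21.4, 20.9, 22.1, 21.0]
print(roughly_comparable(bench_test))
```

A check like this says nothing about absolute accuracy; it only flags a unit that behaves very differently from its neighbours before it goes out into the field.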

In my definition of the Sensor Commons geography doesn’t matter. You can be as hyper localised as measuring the sewage level of a borough as in the case of Don’t Flush Me or measuring radiation on a global scale. The scale upon which you operate is dictated by the type of thing you’re measuring. For example measuring water quality in two unlinked water courses makes almost no sense, in two different oceans it makes even less with regards to comparability.

The 5 requirements of the Sensor Commons.

We’re at a very early stage of the Sensor Commons and attempting to define it may be foolish, however by observing many different types of projects around the world I believe there are five critical requirements for getting a Sensor Commons project off the ground and making it a viable endeavour. A Sensor Commons project must:

Gain trust

Become dispersible

Be highly visible

Be entirely open

Be upgradeable

Each of these will be dealt with in the sections below. A project that has these characteristics will generate a momentum that will exist beyond a core group of technical evangelists and will find a home within the mainstream community.

Gaining trust

Many Sensor Commons projects shine a light on our human behaviour. Ostensibly the goals are noble – to try and understand our environs such that we can make them better and change our behaviour – yet we must stay on the side of data and fact and not move towards blame; others can carry that torch. For example the project that seeks to close down the local smoke stack due to its impact on air quality will have a hard time fostering trust due to its agenda. We all want to have clean air but my kids go to school with the kids whose parents work in said smoke stack – how will I look at them when they lose their jobs?

In the section on dispersal I’ll talk about using existing community assets and infrastructure and trust plays a part in this. If you are piggy-backing the local library’s WiFi so you can get a network connection down in your stream bed it is imperative you don’t abuse their network by sending or requesting too much data – or by harvesting anything you shouldn’t.
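One simple way to stay a polite guest on a borrowed network is to buffer readings and upload them in occasional batches rather than one request per reading. This is a minimal sketch under my own assumptions: the class name, the five minute interval and the `send` callable are hypothetical, and a real node would replace `send` with whatever upload mechanism the project uses.

```python
import json
import time


class PoliteUploader:
    """Buffer readings and release them in one batch at most every `min_interval` seconds."""

    def __init__(self, send, min_interval=300.0, clock=time.monotonic):
        self._send = send                  # callable that takes one JSON string
        self._min_interval = min_interval  # hypothetical default: five minutes
        self._clock = clock
        self._buffer = []
        self._last_sent = None

    def record(self, sensor_id, value):
        """Queue a reading; upload the batch only if enough time has passed."""
        self._buffer.append({"sensor": sensor_id, "value": value})
        now = self._clock()
        if self._last_sent is None or now - self._last_sent >= self._min_interval:
            self._send(json.dumps(self._buffer))
            self._buffer = []
            self._last_sent = now


# Usage: print stands in for the real upload call.
uploader = PoliteUploader(send=print)
uploader.record("creek-1", 7.2)   # first reading goes out straight away
uploader.record("creek-1", 7.3)   # held back until the interval has elapsed
```

Readings held in the buffer only go out with the next call to `record` after the interval, so a node that stops sampling would need a final flush; I’ve left that out to keep the sketch short.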

Trust is provided by having stated project objectives and clear policies around what data you’re going to capture, where it will go and how it will be used and made available. Having someone responsible for dealing with these issues, the “go to person” for any questions that arise, will provide credibility and probably open up some opportunities for partnership as well.

Note how trust requires no technology, merely an understanding of it. This is a perfect role to engage non-technical team members in, especially those who can articulate why the project is important to the community.

As an example, the Don’t Flush Me team have done an excellent job of this. They have built trust with the authorities who are granting them access to the sewerage system – there’s no blame being cast, they are simply trying another way to help with a problem the community already knows about. Similarly they are building trust with the community by creating a valuable resource for people who care about their local environment.

Become dispersible

One of the biggest issues facing Sensor Commons projects is that of dispersion. Projects that seem like such a good idea fall at the hurdle of widespread adoption. Understanding how you can disperse your sensors properly means that like a dandelion seed on the wind you’ll find plenty of places to put down and ensure success.

There are many factors that contribute to this which are discussed below:

Price

This is pretty obvious but can go overlooked. What is the total cost of the sensor? Don’t forget that as well as the material cost of components you need to factor in someone’s time to build the sensor (especially if it’s short run and will be hand built by the project team). You also need to factor in ongoing cost – for example if you have a remote sensor that uses the 3G network you’ll need a data plan that is paid for month by month. Similarly if it breaks down can it be fixed (if so for how much) or is it a straight replacement?
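To make the “total cost” question concrete, here is a back-of-envelope sketch that adds up parts, build time and running costs for a single node. Every figure in it is invented for illustration; the point is only that the monthly data plan can easily outweigh the hardware.

```python
def total_cost_per_sensor(parts, build_hours, hourly_rate,
                          monthly_running, months, expected_repairs=0.0):
    """Rough lifetime cost of one sensor node, in a single currency."""
    build = parts + build_hours * hourly_rate
    running = monthly_running * months
    return build + running + expected_repairs


# Hypothetical figures for a hand-built node over two years.
cellular = total_cost_per_sensor(parts=80, build_hours=3, hourly_rate=25,
                                 monthly_running=10, months=24)
wifi = total_cost_per_sensor(parts=80, build_hours=3, hourly_rate=25,
                             monthly_running=0, months=24)
print(cellular, wifi)
```

With these made-up numbers the node on a data plan costs more than twice as much as the same node on borrowed WiFi, which is the kind of difference that decides how many units a project can afford to scatter.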

Price is a big factor in dispersal. Taking Don’t Flush Me as an example, the cost of the sensor and the data plan make the project unwieldy without donations. While I’m sure this will work in the end, this isn’t the path towards quick dispersal. Contrast this with say the radiation data gathered by individuals globally – while the sensors themselves were relatively expensive, the network cost was negligible and thus led to higher uptake.

Tapping into local assets

If you can gain trust with the community then you get the opportunity to try and use community assets to help with dispersal. If you can use things like WiFi on your project, why not talk to people locally and see if they would be willing to “host” one of your sensors on their network, so long as you don’t do anything silly? I used to belong to a kitesurfing group in the UK and we wanted to get a networked wind meter at our local beach so we could see if it was windy enough to go kitesurfing. As we all drank in the local pub on the beach anyway, the owner allowed us to mount a weather station on his roof and use his Internet connection to publish the data where we could all see it.

Do you have a local library that is on a main road? Might make a good location for an air quality sensor that uses their WiFi to stream the data back up. Libraries, schools, local council buildings are all community infrastructure – it’s worth a conversation to see if you can use it for your project.

In Melbourne you can walk along some of the suburban creeks that run into our bay and never be out of range of a WiFi connection for their entire length. Surely some of the people who live along those creeks would have an interest in the quality of the water and would share their WiFi with you? If you’re a local organisation you probably know some of them or know someone that knows some of them already.

Utilising local assets can dramatically drop the cost of a project, meaning more units, greater dispersal and better community engagement too.

Units should be self contained

The sensors themselves should be as simple and self contained as feasibly possible. Utilising batteries or solar power makes sense and if you can use WiFi then it’s even better. WiFi modules for things like Arduino use are becoming pretty cheap now so won’t blow your budget too much. It’s still cheaper to run cable if you can however this is another barrier to dispersal. It’s one thing to ask someone if you can put the sensor in the creek next to their house and send the data through their WiFi, it’s another entirely to ask them to route a cable across their yard, in a window and across a room to plug it into their router.

Don’t underestimate the cable factor – I’ve had my wonderful and generally relaxed wife draw a line at visible cables taped to the decking so I could get data from the back yard into the house.

What level of technical knowledge is required to deploy?

To gain higher levels of dispersal, drop the technical knowledge required to deploy. There’s a reason why Linksys and Netgear own the home router market – because anyone with some very basic instructions could deploy a box and get their home Internet up and running reliably. If you have a difficult package to deploy it means the technical members of your project will need to do it. This may not be a problem if you’re doing small scale projects but if you have say hundreds of nodes this becomes an issue.

An API makes your data dispersible too

Once you have your data, wrap it in an API so it becomes dispersible too. This doesn’t need to be a grand piece of software engineering: either let me have all of it and document what you’ve got, or else provide me a method of querying your data over a period (from X to Y) for the entire node array or individual nodes in the array. Use a lightweight and expressive format such as JSON and you’ll provide a data set that can be readily used, integrated into other systems or mashed up with other data sources easily.
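The “from X to Y, for one node or all of them” query is small enough to sketch in a few lines. This is not any particular project’s API: the node names, the `ph` field and the in-memory list are placeholders I’ve made up, and a real service would sit this logic behind an HTTP endpoint and a proper data store.

```python
import json
from datetime import datetime

# Hypothetical stored readings; a real project would read these from its database.
READINGS = [
    {"node": "creek-1", "time": "2011-11-01T09:00:00", "ph": 7.1},
    {"node": "creek-2", "time": "2011-11-01T09:00:00", "ph": 6.8},
    {"node": "creek-1", "time": "2011-11-02T09:00:00", "ph": 7.3},
]


def query_readings(start, end, node=None, readings=READINGS):
    """Return readings with start <= time < end as a JSON string.

    Pass `node` to restrict the result to one node; omit it for the whole array.
    """
    lo, hi = datetime.fromisoformat(start), datetime.fromisoformat(end)
    selected = [
        r for r in readings
        if lo <= datetime.fromisoformat(r["time"]) < hi
        and (node is None or r["node"] == node)
    ]
    return json.dumps(selected)


# Whole array for one day, then a single node over two days.
print(query_readings("2011-11-01T00:00:00", "2011-11-02T00:00:00"))
print(query_readings("2011-11-01T00:00:00", "2011-11-03T00:00:00", node="creek-1"))
```

Returning plain JSON with ISO timestamps keeps the result easy to load into a spreadsheet, a map or someone else’s script without any knowledge of how the sensors work.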

Adopt permissive licensing

Permissive licensing for your hardware, software and data allows it to be used and improved upon by others. You probably haven’t considered all of the uses people will come up with for your project so let others help you.

Be highly visible

There are two aspects of visibility that should be considered; first the visibility of the device itself and second, the visibility of the data created.

With respect to the sensor itself if it is in a public place then you should endeavour to make it visible and also provide information about what it is there for. Occasionally you’re going to get vandals trashing your stuff – there’s not much you can do about it. However if you take the opportunity to explain what it is and what the project is about then it becomes harder for someone to vandalise a community project than something put there “by the man”.

Imagine engaging with a local council that has a display on the side of their building showing the overall air quality score for the borough in real time. These sorts of Civic Displays could become quite commonplace as different projects feed data into them. There’s probably an opportunity for civic art to incorporate data from these types of projects and display it in interesting ways to the local population.

By creating visibility of the data we can raise awareness or affect behaviour which is often the goal for many of these projects.

Data should be visible online as well – not simply by making the data sets available but also by highlighting some meaning as well. What I found most interesting about the self-assembly of the radiation data on Pachube in the wake of the Fukushima incident was that it wasn’t “real” until it was on a Google map. Prior to that point there were dozens of data streams but it was too hard to interpret the data. Making your data visible in this instance means making it approachable for people to gain understanding from it.

Be entirely open

Openness in this day and age is almost expected but it’s worth pointing out that the projects that open source all of their code, schematics and data will do better than those that don’t.

The other part of openness however is about the wider project context. This type of openness is about the transparency of the project objectives and the findings, documenting any assumptions about your data such as its likely error rate and whether you’re doing any manipulation of the raw data to derive a metric.

Government data sets and sensor networks are steadfastly closed but there is a lot of weight given to them because they have an implied lack of error and high precision. Ostensibly this is because they are supposed to be “well engineered”, rigorously tested and highly calibrated devices – why else would one sensor cost $50,000?

With radiation data on Pachube as an example, there was much made in April about how reliable it was given that it wasn’t calibrated, the sensors were probably sitting on people’s windows and that they were only consumer grade. Precision was never the intent for those deploying the sensors however, so the argument was moot – ultimately the point was to assess trend. If my sensor has an accuracy level of ± 20% then it’s always going to be out – probably by a similar amount. However if it goes up consistently over time, even though it’s out by 20% the trend is still going up – and I wouldn’t have known about that unless I was using a more deployable sensor because the government one is probably 200km away.
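The trend argument can be shown with a few lines of arithmetic. The sketch below fits a straight line to some made-up “true” levels and to the same levels as seen by a sensor that always reads 20% low. It assumes the error is a constant proportion; if the error itself wanders over time then the trend can no longer be read off this simply.

```python
def slope(values):
    """Least-squares slope of values against their index (units per sample)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical "true" weekly levels, and a cheap sensor that always reads 20% low.
true_levels = [50, 52, 55, 59, 64, 70]
cheap_sensor = [0.8 * v for v in true_levels]

# Both slopes are positive: the cheap sensor understates the size of the rise
# but still shows that the level is going up.
print(slope(true_levels) > 0, slope(cheap_sensor) > 0)
```

The cheap sensor’s slope is simply the true slope scaled by the same factor, so the direction of travel survives even though no individual reading is right.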

Having a culture of openness and transparency makes up for the error and lack of precision in the data. By “showing your workings” it opens up your data and method for critique and from there allows room for improvement. It also provides a method by which you can agree or disagree with the assumptions if you want to use the data and make an informed decision underpinning the data set.

Be upgradeable

The final requirement is to be upgradeable. One of the benefits of Moore’s Law is that not only do we get more computing power for the same price over time but we also get the same computing power for fewer dollars over time. Consider a humble Arduino – something that is more powerful for about $40 than a multi-thousand dollar 286 PC back in the late 80s.

Being able to upgrade your sensor network allows you to take advantage of all the developments happening in this space. Adequately modularising your components so they can be switched out (eg switching to WiFi from cabled Ethernet) as well as abstracting your code (not doing heavy processing on your sensor, offloading it to the acquirer then processing it) makes upgrading easy over time.
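In software terms, one way to get this kind of modularity is to have the sensor code talk to a small “transport” interface rather than to Ethernet or WiFi directly, and to send raw values so any heavy processing stays on the acquirer. The class names below are my own invention, and the transport is a stand-in that only records what it was asked to send; a real one would open a network connection.

```python
import json
from typing import Callable, List, Protocol


class Transport(Protocol):
    """Anything able to deliver one payload string to the acquirer."""

    def send(self, payload: str) -> None: ...


class RecordingTransport:
    """Stand-in transport that just remembers what it was asked to send.

    A real Ethernet or WiFi transport would open a connection here instead.
    """

    def __init__(self, name: str) -> None:
        self.name = name
        self.sent: List[str] = []

    def send(self, payload: str) -> None:
        self.sent.append(payload)


class SensorNode:
    """Reads a raw value and ships it unprocessed; analysis happens upstream."""

    def __init__(self, node_id: str, read_raw: Callable[[], float],
                 transport: Transport) -> None:
        self.node_id = node_id
        self.read_raw = read_raw
        self.transport = transport

    def report(self) -> None:
        payload = json.dumps({"node": self.node_id, "raw": self.read_raw()})
        self.transport.send(payload)


# Hypothetical node that starts on a cabled link and is later moved to WiFi.
ethernet = RecordingTransport("ethernet")
wifi = RecordingTransport("wifi")
node = SensorNode("yard-1", read_raw=lambda: 512, transport=ethernet)
node.report()
node.transport = wifi   # the upgrade: nothing else in the node changes
node.report()
print(len(ethernet.sent), len(wifi.sent))
```

Because the node only knows about `send`, swapping the network hardware later means writing one new transport class rather than reworking the sensor code.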

This means your project gets better over time rather than stagnating.

The Sensor Commons Future

Smart Cities are all well and good and IBM, Cisco and others are more than welcome to their ideas and products to make our urban infrastructure more clever – we need it more than ever now. For me this vision is narrow in that the top-down view made from a very tall tower provides an architecture that doesn’t seem to solve problems at a local level. Humans, by our nature are highly localised beings – whilst we may have to travel long distances to work we only travel a few kilometres from where we live and work once we’re there. As such we develop profound connections to our local environments – this is why we see “friends of” groups spring up for local parks, creeks or other reserves and why communities lobby so heavily for protection of these spaces. This type of technology enables us to interact with our environments differently.

If you think this is all naive data-driven techno-utopia think again.

Governments are starting to look at ways they can push their data into platforms like Pachube to make it accessible. Germany is in the process of doing this with its radiation data.

Individuals and project groups are already using tools like Pachube, Thingspeak and Open Sense to aggregate data from their local environment (eg: CO2 levels).

It’s becoming almost trivially easy to create the sensors and the web tools are there to hold the data and start the process of understanding it. The chart below shows the temperature in my back yard in real time for the last week.

The access we are getting to cheap, reliable, malleable technologies such as Arduino and XBee, coupled with ubiquitous networks whether WiFi or cellular, is creating an opportunity for us to be able to understand our local environments better. Gone are the days when we needed to petition councillors to do some water testing in our creeks and waterways or measure the quality of the air that we are breathing.

The deployment of these community oriented technologies will create the Sensor Commons; providing us with data that becomes available and accessible to anyone with an interest. Policy creation and stewardship will pass back to the local communities – as it should be – who will have the data to back up their decisions and create strong actions as a result.

If you have a project that is creating a Sensor Commons I’d love to hear about it. I’ll list them down here as I get them.

Yes that’s an awesome example. In principle they were trying to use the WiFi data in order to help geolocation services in order to make mapping etc much better (ie this WiFi hotspot is located at this geographic position) but their approach and reaction to finding out about it destroyed any trust the public had with them and has probably resulted in a lot more pressure being applied to the Street View team as a result.

Very good post. Interesting motivation – lack of govt. support – the link to the resilience movt. is needed. The main ‘smart’ city discussions fall into the naive techno-utopia category by and large.
The transparency issue is not easy to achieve but vital. The history of Stevenson screens used for short term met data, and then abused for long term climate data is worth examining in this context.

But…. You are obviously a software guy. There is an old saying: “Never give a screwdriver to a programmer!” (I don’t remember what software guys say about hardware jockeys.) If you want this project to succeed, then you need to get a hardware guy involved, specifically someone with an applied background in instrumentation on an industrial level. Your article glosses over many aspects of the problem that are critical to success.

For example, you downplay accuracy, but you also assume that the accuracy is constant. That is not a valid assumption. Low cost instruments drift over time. Even high-priced instruments have drift. That is why calibration programs are so important if you want to USE the data. When a calibration program is in place, THEN you can ignore the accuracy, because you have normalized it. In your proposal, accuracy is unknown and variable.

Instrumentation engineers have learned over the years that placement of the sensors affects readings, environment affects readings, all sorts of things affect readings. Even power fluctuations affect readings. Tap into someone who can help you avoid the commonly known pitfalls on data acquisition and control, and then you’ll have a project that can make a difference in the world.

Otherwise, the first time you trot out a dataset, someone is going to tear it to pieces by addressing its lack of coherence, its lack of calibration, its lack of correlation between measurement instruments, and its inability to be traced back to known measurement standards. (See NIST or the French measurement institute for details.)

I also left a comment on TechDirt about this that addresses this in more depth.

A device for your sensor commons project needs to be Scalable, Integrated, Orthogonal, Modular, Open Source, Extensible. How else do you manage and connect to 340,282,366,920,938,463,463,374,607,431,770,000,000 things?

@briansj Thanks for your comment and I think the point about Stevenson screens is an important one too. Especially because originally this was about a “process” for climate data capture not specifically a device for doing so. Especially back in the early days (mid nineteenth century) where a lot of the devices for capture of this type of information (temperature, rainfall, humidity etc) were largely made by enthusiasts.

@artp Thanks for your comment and I read your comment over at TechDirt on this front as well.

First up – yes I’m primarily a software guy – one that works with a lot of data day in day out within a variety of domains doing a lot of statistical analysis and processing on them. From a hardware perspective you’d call me an “educated tinkerer” and I would be first to suggest that having people involved with more hardware experience than mine would be 1) desirable and 2) critical to the success of any project resulting from this post on a grander scale.

From an accuracy standpoint I take a hugely pragmatic view:

If I have one super accurate sensor (say ±0.1%) located in the middle of a city that measures air pollution, is this more or less useful than having 10 sensors with good accuracy (say ±1%) scattered around the city, or a thousand with say ±10% accuracy? It’s a cost benefit analysis. At the moment we have the first in most cases; some places (eg London) are lucky enough to have the second, yet it’s the third option that provides the most opportunity for understanding and insight creation – irrespective of the accuracy issue.

Would I prefer numerous, cheap, accurate devices that are well calibrated that provide public data? You bet I would! But failing that I’ll take lots of cheap inaccurate ones any day.

I’m assuming that there will be a degree of accuracy involved that will be “sufficient” for the task at hand. More importantly though, I’m interested in how even an inaccurate data set can be used to create understanding and be used to set an agenda.

Take for example the data created as a result of the Fukushima disaster. The work done by Shigeru Kobayashi and others (see his excellent Maker Faire presentation here: http://www.slideshare.net/kotobuki/maker-faire-nyc-2011) highlighted that the data either wasn’t available, was stagnant, wasn’t in an open format or was inaccurate anyway. More importantly, the lack of data led to bad reporting and poor public understanding of the situation.

In the presentation he talks about how the community assembled and how expertise surfaced that had the necessary understanding to calibrate geiger counters and the equipment. My assumption on the Sensor Commons front is that this will happen more often than not through collaboration of interested parties and the self-selection of projects to expertise. For example I’m working on a soil moisture project and have had collaboration with various people all around the world working on probe design that is “good enough” for what I’m looking to achieve. Others are taking that a lot further towards highly accurate devices but even ±5% is sufficient for my needs (crop irrigation automation). The point is, I can get access to knowledge that refines the accuracy of my solution.

BrianSJ makes a comment about Stevenson screens which I think is a good one – not least because in the mid 1800s when that design surfaced a lot of the instrumentation was relatively DIY or at least if it wasn’t it was largely manufactured close to where it was used. Stevenson established some expertise in terms of how climate information should be captured (use this set up, record at these points etc) but then the individual sensors were not calibrated to each other. The point here is that the sensors were consistent enough for their local use to show trends over time. The problem with this is that you can’t compare data from say 120 years ago with readings now as the sensors now are standardised and much better – highlighting the inaccuracy of the old solutions.

I understand that my pragmatism isn’t the best approach with respect to high fidelity data sets but it does support coverage and more importantly it creates an opportunity for discussion.

Completing the radiation discussion, Safecast are now looking to roll out hundreds of these sensors in Japan and create an Open data set supporting it. In Germany, government data has been converted to a format that can be readily uploaded into Pachube and released for the public to see, review and understand. Air Quality Egg seeks to do the same thing for air pollution and to create an agenda for discussion about air quality in European cities.

By giving more people the ability to produce and deploy these types of devices we create the opportunity for the community to create a discussion that moves away from anecdote (eg letter to the council saying the air seems rather polluted over the last 6 months) through to more objective discussion (as a community we’ve used 10 sensors in this area and we can see there is a rising level of X) which can inform policy (eg council commissioning an environmental auditor to do some research into what’s going on).