During the 2008 summer Olympics, the Beijing Air Track project took a team of photographers from Associated Press and used them to smuggle hand-held pollution sensors in to Beijing. Using their press access to the Olympic venues, they gathered pollution readings to test the Chinese government’s data that a series of extreme emergency measures put in place in the run-up to the games had improved the cities notoriously poor air quality. They were not the only organisation to use sensors in this way. The BBC’s Beijing office also used a hand-held sensor to test air pollution gathering data that appeared in a number of reports during the games.

“prime example of how sensors, data journalism, and old-fashioned, on-the-ground reporting can be combined to shine a new level of accountability on official reports”.

In contrast to the Chinese data, the level of transparency displayed in the way the data was collected vividly illustrates how sensors can play a part in reinforcing data journalism role in the process of accountability.

Testing the context, provenance and ownership – where our data comes from and why – is a fundamental part of the data journalism process. If we are not critical of the data we use (and those that provide it), perhaps becoming over-reliant on data press releases , we can risk undermining our credibility with data-churnalism or, worse still, data-porn! . As data journalism practice evolves, whilst the basic critical skills will remain fundamental, it would seem logical to explore ways that we reduce our dependency on other sources all together. The Beijing project, with its use of sensors, offers a compelling solution. As Javaun Moradi, product manager for NPR digital, succinctly put it:

“If stage 1 of data journalism was ‘find and scrape data.’, then stage 2 was ‘ask government agencies to release data’ in easy to use formats. Stage 3 is going to be ‘make your own data’”

Crowdsensing data
The three stages that Moradi identifies are not mutually exclusive. Many data journalism projects already include an element of gathering new data often done using traditional forms of crowdsourcing; questionnaires or polls. As much as involving the audience has its benefits, it is notoriously unpredictable and time-consuming. But as individuals we already make a huge amount of data. That isn’t just data about us collected by others through a swipe of a loyalty card or by submitting a tax return online. It’s also data we collect about ourselves and the world around us.

An increasing number of us strap sensors to ourselves that track our health and exercise and the “internet of things” is creating a growing source of data from the buildings and objects around us. The sensors used by the AP team were specialist air pollution sensors that cost in excess of $400 – an expensive way for cash-strapped newsrooms to counter dodgy data. Since 2008 however, the price has dropped and the growing availability of cheap computing devices such as Raspberry Pi and Arduino and the collaborative and open source ethic of the hacker and maker communities, have lowered the barriers to entry. Now sensors, and the crowd they attract, are a serious option for developing data driven reporting.

Hunting for (real) bugs with data
In 2013, New York braced itself for an invasion. Every 17 years a giant swarm of cicadas descend on the East Coast. The problem is that exactly when in the year the insects will appear is less predictable. The best indicator of the emergence of the mega-swarm (as many as a billion cicadas in a square mile) seems to be when the temperature eight inches below the ground reaches 64 degrees (18C). So when John Keefe, WNYC’s senior editor for data news and journalism technology, met with news teams to look at ways to cover the story, he thought of the tinkering he had done with Arduino’s and Raspberry Pi’s . He thought of sensors.

Keefe could not find a source for the data that offered any level of local detail across the whole of New York. He took the problem of how to collect the data to a local hackathon, organised by the stations popular science show Radiolab, who helped create a “recipe” for an affordable, easy to make temperature sensor which listeners could build and send results back to a website where they would map the information

Developing collaboration.
Whilst sensors play an enabling role in both examples, underpinning both the Beijing AirTrack and Cicada projects is the idea of collaboration. The Beijing project was originally developed by a team from the Spatial Information Lab at Columbia University. Combining the access of the media with the academic process and expertise of the lab gave the project a much bigger reach and authority. It’s a form of institutional collaboration that echoes in a small way in more recent projects such as The Guardian’s 2012’s Reading the riots. The Cicada project, on the other hand, offers an insight into a kind of community-driven collaboration that reflects the broader trend of online networks and the dynamic way groups form.

Safecast and the Fukushima nuclear crisis
On 9 March 2011, Joichi Ito was in Cambridge Massachusetts. He had travelled from Japan for an interview to become head of MIT’s prestigious Media Lab. The same day a massive underwater earthquake off the coast of Japan caused a devastating tsunami and triggered a meltdown at the Fukushima Dai-ichi nuclear plant, starting the worst nuclear crisis since Chernobyl in 1986. Ito, like many others, turned to the web and social media to find out if family and friends were safe and gather as much information as he could about the risk from radiation

At the same time as Ito was searching for news about his family, US web developer Marcelino Alvarez was in Portland scouring the web for information about the possible impact of the radiation on the US’s west coast. He decided to channel his “paranoia” and within 72 hours his company had created RDTN.org, a website aggregating and mapping information about the level of radiation .

For Alvarez and Ito the hunt for information soon developed into an effort to source geiger counters to send to Japan. Within a week of the disaster, the two had been introduced and RDTN.org became part of project that would become Safecast.org. As demand outstripped supply, their efforts to buy geiger counters quickly transformed into a community driven effort to design and build cheap, accurate sensors that could deployed quickly to gather up to date information.

SIDENOTE: It will be interesting to see how the experiences of Beijing and Safecast could come together in the coverage of the 2020 Olympics in Japan

Solving problems: Useful data and Purposed conversations
Examples such as WNYC’s cicada project show how a strong base of community engagement can help enable data-driven projects. But the Safecast network was not planned, it grew

“from purposed conversations among friends to full time organization gradually over a period of time”

There was no news conference to decide the when and the how it would respond or attempt to target contributors. It was a complex, self-selecting, mix of different motivations and passions that coalesced into a coherent response to solve a problem. It’s a level of responsiveness and scale of coverage that news organisations would struggle to match on their own. In that context, Moradi believes that journalism has a different role to play:

Whether they know it or not, they do need an objective third party to validate their work and give it authenticity. News organisations are uniquely positioned to serve as ethical overseers, moderators between antagonistic parties, or facilitators of open public dialogue

Building bridges
Taking a position as a “bridge” between those with data and resources and “the public who desperately want to understand the data and access it but need help” is a new reading of what many would recognise as a traditional part of journalism’s process and identity. The alignment of data journalism with the core principles of accountability and the purpose of investigative journalism, in particular, makes for a near perfect meeting point for the dynamic mix of like-minded hacks, academics and hackers, motivated not just by transparency and accountability. It also taps into a desire not just to highlight issues but begin to put in place solutions to problems. This mix of ideologies, as the WikiLeaks story shows , can be explosive but the output has proved invaluable in helping (re)establish the role of journalism in the digital space. Whether it is a catalyst to bring groups together, engage and amplify the work of others or a way, as Moradi puts it, to “advance the cause of journalism by means other than reporting” , sensor journalism seems to be an effective gateway to exploring these new opportunities

The digital divide
The rapid growth of data journalism has played a part in directing attention, and large sums of money, to projects that take abstract concepts like open government and “make them tangible, relevant and useful to real live humans in our communities”. It’s no surprise, then, that many of them take advantage of sensors and their associated communities to help build their resources. Innovative uses of smart phones, co-opting the internet of things or using crowd funded sensor project like the Air quality egg. But a majority of the successful data projects funded by organisations such as the Knight Foundation, have outputs that are almost exclusively digital; apps or data dashboards. As much as they rely on the physical to gather data, the results remain resolutely trapped in the digital space.

“We are at a tipping point in relation to the on-line world. It is moving from conferring advantage on those who are in it to conferring active disadvantage on those who are without”

The solution to this digital divide is to focus on getting those who are not online connected. As positive as this is, it’s a predictably technological deterministic solution to the problem that critics say conflates digital inclusion with social inclusion . For journalism, and data journalism in particular, it raises an interesting challenge to claims of “combating information asymmetry” and increasing the data literacy of their readers on a mass scale .

Insight journalism: Journalism as data
In the same year as Digital Britain report appeared, the Bespoke project dived into the digital divide by exploring ways to create real objects that could act as interfaces to the online world. The project took residents from the Callon and Fishwick areas in Preston, Lancashire, recognised as some of the most deprived areas in the UK, and trained them as community journalists who contributed to a “hyperlocal” newspaper that was distributed round the estate. The paper also served as a way of collecting “data” for designers who developed digitally connected objects aimed at solving problems identified by the journalists. A process the team dubbed insight journalism .

Wayfinder at St Matthew’s church (Image copyright Garry Cook)

One example, the Wayfinder, was a digital display and a moving arrow which users could text to point to events happening in the local area.

Another, Viewpoint was a kiosk, placed in local shops that allowed users to vote on questions from other residents, the council and other interested parties. The questioner had to agree that they would act on the responses they got, a promise that was scrutinised by the journalists.

The idea was developed during the 2012 Unbox festival in India, when a group of designers and journalists applied the model of insight journalism to the issue of sexual harassment on the streets of New Delhi. The solution, built on reports and information gathered by journalists, was to build a device that would sit on top of one of the many telegraph poles that clutter the streets attracting thousands of birds. The designers created a bird table fitted with a bell. When a woman felt threatened or was subjected to unwanted attention she could use Twitter to “tweet” the nearest bird table and a bell would ring. The ringing bell would scatter any roosting birds giving a visible sign of a problem in the area. The solution was as poetic as it was practical, highlighting not just the impact of the physical but the power of journalism as data to help solve a problem.

Stage four: Make data real
Despite its successes sensor journalism is still a developing area and it is not yet clear if it will see any growth beyond the environmental issues that drive many of the examples presented here. Like data journalism, much of the discussion around the field focuses on the new opportunities it presents. These often intersect with equally nascent but seductive ideas such as drone journalism. More often than not, though, they bring the discussion back to the more familiar ground of the challenges of social media, managing communities and engagement.

As journalism follows the mechanisms of the institutions it is meant to hold to account into the digital space, it is perhaps a chance to think about how data journalism can move beyond simply building capacity within the industry, providing useful case studies. Perhaps it is a way to help journalism re-connect to the minority of those in society who, by choice or by circumstance, are left disconnected.

Thinking about ways to make the data we find and the data journalism we create physical, closes a loop on a process that starts with real people in the real world. It begins to raise important questions about what journalism’s role should be in not just capturing the problems and raising awareness but also creating solutions. In an industry struggling to re-connect, it maybe also starts to address the issue of solving the problem placing journalism back in the community and making it sustainable. Researchers reflecting on the Bespoke project noted that:

“elements of the journalism process put in place to inform the design process have continued to operate in the community and have proven to be more sustainable as an intervention than the designs themselves”

If stage three is to make our own data, perhaps it is time to start thinking about stage four of data journalism and make data real.