Blogging data

Programmer blogs explain the science behind the magic

Data journalism and information visualization is a burgeoning field. Every week, Between the Spreadsheets will analyze, interrogate, and explore emerging work in this area. Between the Spreadsheets is brought to you by CJR and Columbia’s Tow Center for Digital Journalism.

The outlets that consistently produce innovative and compelling pieces of data journalism are also starting to explain how they’re producing them. Developers at the Guardian, The New York Times, the Daily Beast, and ProPublica all maintain blogs that run through the process of making interactives. Like any good blog, each has a distinct focus and a strong voice, and each engages directly with the community it serves: the ever-expanding band of news application developers.

The Guardian’s Developer Blog is perhaps one of the most active. It contains not only explanatory posts by interactive developers, but also pieces about wider technology trends and findings of interest to those developers.

The blog is useful for programmers who want guides to building tools and a better understanding of the wider context in which they’re working. For example, the most recent post is a piece about mobile traffic data. The focus is on the process of redesigning the Guardian’s mobile site, but it also discusses what the interactive team found while examining traffic figures. The Guardian shared the results, showing that traffic to its mobile sites rose significantly over two years and that Apple dominated the devices driving that traffic. The analysis informed the Guardian’s own mobile site redesign. Because the Guardian shared its figures, outlets considering a similar project have a comparison point for thinking about their own needs.

The blog has all the hallmarks of the Guardian’s open approach to news. Its developers aren’t just sharing guides and lines of code so that others can copy their work; they’re also sharing their processes, findings, and thinking about the wider issues of news development. In doing so, the Guardian has created a community and significantly added to the news development conversation.

The Daily Beast’s News Beast Labs is one of the youngest news developer blogs. It’s run by the Beast’s data reporters, social media editors, and developers. Its purpose is straightforward: “On this Tumblr, we’ll talk about how the technology we used helped us tell the story we were after and what decisions went into displaying a story in a particular way.”

To that end, posts on the Tumblr are all ‘behind-the-scenes’ guides on how teams produced stories. More importantly, these posts are infused with the rationale behind editorial decisions to use a particular technology. For example, senior data reporter Michael Keller’s post in which he explains how he made the ‘Obama Hater Book Club’ piece also discusses the potential pitfalls of using Google Docs to power the Beast’s news applications.

The Daily Beast also takes care to link out to other developers’ writing on similar topics. Again, this contributes to the wider discussion bubbling in cyberspace about news development. As Jonathan Stray wrote, “links are a currency of collaboration.” Linking out to others’ content demonstrates a media organization’s willingness to point its audience to resources it doesn’t have itself, and that is commendable.

The New York Times’s Open blog is similar to the Guardian’s in the sense that it contains content beyond explanatory posts about its work. In fact, Open, written by NYT developers, is filled with posts about TimesOpen events, regular hackathons, and events like the recent Hack Day. It serves as a resource for NYT’s active coding community.

ProPublica’s Nerd Blog lives up to its name. It’s dense with technical terms and lines of code, and it breaks down the process that goes into producing its interactives. It’s written by the reporters and developers who worked on the pieces, which gives it valuable insight into the building process. For the programmatically faint-hearted, some of the posts are inaccessible. To another developer, however, they read as clear as day. And because ProPublica encourages the use of its tools (it allows others both to embed its interactives and to use its code to make their own pieces), the blog also functions as a how-to guide.

ProPublica also does the best job of promoting the Nerd Blog: it sits right at the top of the Tools and Data landing page, in its own dedicated box, making it easy for anyone looking at ProPublica’s data journalism to know that it’s available.

It’s a shame to see that the Texas Tribune’s On The Records blog hasn’t been updated since August. In the way it explains how the Texas Tribune’s pieces are made, it’s more of a data blog than a developers’ blog. Some posts pick out key findings from data stories, while others are about government-released data. The Tribune is doing some truly enterprising data journalism. More than anything else, the drop-off in its blog entries highlights the fact that maintaining these resources is a huge ask in itself.

As more news outlets produce innovative pieces of data journalism, however, developers’ blogs will only continue to grow in importance. Development is a collaborative space; the more data reporters and programmers blog about their innovative news applications, the more they encourage other outlets to start making their own pieces.
