Since its first edition in 2004, the association’s annual conference has been the premier global arena bringing together highly engaged digital journalists, multimedia producers, content editors, technologists, programmers, designers and newsroom decision-makers from major media markets, independent websites and leading academic institutions.

This year, too, hundreds of participants converged from all around the world to learn about the latest software and hardware tools for content management, search and distribution platforms, to discuss advances and challenges in the industry, and to network face to face and share best practices.

After the official inauguration on 28 October, the following two days featured an intensive marathon of thematic sessions where prestigious speakers reviewed the current state of the art in all aspects of online journalism.

APIs and social networks: The revolution of news distribution

Day 1 took off with the latest trend in technology-driven collaborative journalism: ‘Contents-Sharing through APIs’. This was the title of the panel with Delyn Simons, director of platform strategies at Mashery.com, a leading provider of customised platforms through which online media can let third parties re-use and present their content in all kinds of new ways, expanding both visibility and audience.

Delyn outlined case studies of news organisations using Mashery services, such as the New York Times, USA Today and, in particular, the Guardian, which has just launched its Open Platform Webfeed. By logging in with a personal API key, anybody can access and organise data from the British news daily, possibly remixing it with their own data, in order to create original online products for either a personal web platform or the Guardian’s website.
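To make the API-key workflow above concrete, here is a minimal Python sketch of querying a news content API in the style of the Guardian’s Open Platform. The endpoint, parameter names and response shape are illustrative assumptions rather than the documented interface, and the HTTP call is replaced by a hand-made sample response.

```python
# Sketch of querying a news content API such as the Guardian's Open Platform.
# Endpoint, parameter names and response shape are assumptions for illustration;
# consult the Open Platform documentation before relying on them.
import json
from urllib.parse import urlencode

BASE_URL = "http://content.guardianapis.com/search"  # assumed endpoint

def build_search_url(query, api_key, page_size=10):
    """Build a URL asking the API for articles matching `query`."""
    params = {"q": query, "api-key": api_key,
              "page-size": page_size, "format": "json"}
    return BASE_URL + "?" + urlencode(sorted(params.items()))

def extract_headlines(response_text):
    """Pull headline strings out of a JSON response shaped like the sample below."""
    payload = json.loads(response_text)
    return [item["webTitle"] for item in payload["response"]["results"]]

# A hand-made sample response standing in for a live HTTP call.
sample = json.dumps({"response": {"results": [
    {"webTitle": "Open Platform launches"},
    {"webTitle": "Remixing the news"},
]}})

print(build_search_url("open platform", "YOUR-KEY"))
print(extract_headlines(sample))
```

The point of the pattern is simply that the API key travels as a request parameter and the content comes back as structured data, ready to be remixed with a developer’s own data sets.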

The parallel session ‘Rethinking Online Commenting’, moderated by Alicia Shepard, ombudsman at National Public Radio site NPR.org, discussed newsrooms’ policies for user engagement. The same topic was covered in more technical detail at the panel ‘Social Media Storytelling’, where Zach Seward, social media editor at the Wall Street Journal (WSJ), unveiled the secrets of successfully using Twitter and Facebook when reporting a story.

“One of the first steps we take is trying to identify what the potential community or audience is. Usually that is as simple as me asking a reporter about groups and existing communities around his or her subject area”, Zach says. “Then it’s figuring out how to get in front of and be a part of that community. That doesn’t mean you have to have a Facebook, Twitter or Digg account for every project or reporter”.

Zach cited the concrete case of the Facebook page created by the WSJ to document a Haitian-American’s mission to rescue his family in Port-au-Prince soon after the earthquake. “Our foreign editor had an idea to tell the story in real time. We thought of the best way to make that happen, and a Facebook page, with its status updates, seemed particularly useful”.

How to preserve news quality in the online environment

Besides enhancing content distribution, technology can also help improve content production. One of the most powerful examples is DocumentCloud.org, a new open-source semantic-web platform which makes primary source materials easier to scour, annotate and share.

At the panel named after his own company, Jeremy Ashkenas, lead developer at DocumentCloud, showcased a number of investigations conducted by news outlets across the US using DocumentCloud as a workspace where reporters upload documents, share them with their team and run structured searches and analyses based on extracted entities: the people, places and organisations mentioned in the text of the documents.
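The entity-based search described above can be sketched as a toy model: each document carries the entities extracted from its text, and reporters filter the collection by entity type and name. The sample data and field names here are invented for illustration; the real DocumentCloud API is structured differently.

```python
# Toy model of a DocumentCloud-style workflow: documents carry extracted
# entities, and reporters run structured searches over them.
# Data and field names are invented for illustration.
documents = [
    {"title": "City budget memo",
     "entities": {"person": ["J. Doe"],
                  "organization": ["City Hall"],
                  "place": ["Springfield"]}},
    {"title": "Contract award letter",
     "entities": {"person": ["A. Smith"],
                  "organization": ["Acme Corp", "City Hall"],
                  "place": []}},
]

def search_by_entity(docs, kind, name):
    """Return titles of documents whose extracted entities of `kind` include `name`."""
    return [d["title"] for d in docs if name in d["entities"].get(kind, [])]

print(search_by_entity(documents, "organization", "City Hall"))
# -> ['City budget memo', 'Contract award letter']
```

The value for an investigation is that a query like this surfaces every document mentioning a given organisation, however the source material was originally filed.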

In-depth journalism was also the theme of the panel ‘The New Investigative Journalism Ecosystem’, where Charles Lewis and Kevin Davies, respectively founder and CEO of the new Investigative News Network (INN), explained how the number of global non-profit reporting organisations (many of them INN members) has exploded, from three in 1990 to more than 30 today, and how they use web tools and platforms to collaborate and make public-interest journalism available to a growing number of online users.

But how can accurate reporting survive at a time when journalists draw on more and more online sources that are not always reliable? An attempt to answer this challenging question was made by Solana Larsen, managing editor at GlobalVoices, at the panel ‘Tools for Crisis Reporting’.

According to Solana, journalists often belong to two opposite and extreme categories: On the one hand, you have those who rely too heavily on social networks without doing any background checks or speaking with real people; on the other hand, you have those who rely on official sources only and don’t look for unreported local voices scattered across the web.

The GlobalVoices platform intends to fill this gap by helping journalists use alternative sources of information appropriately. How? “Unless you talk to somebody who knows the blogosphere of a given country well enough, you cannot tell whether what is published on a specific blog is representative of a general trend”, Solana says. “GlobalVoices aggregates comments on each issue from all local blogs in order to provide a more accurate and diversified picture”.

More HiTech, more news

Day 2 was marked by the panel ‘Ten Tech Trends in ’10’ where Amy Webb, CEO at her own consultancy company Webbmedia Group, highlighted the latest digital tools and their application to online journalism.

Let’s start with what is called Geofencing. “Networked mobile applications can now literally locate people in a defined space”, Amy says. “That implies a radical change for hyperlocal journalism. Today people go to a website, type a zip code and get local news. Tomorrow, with a geofence, people can run a mobile app which allows their phone to be identified in a given space and automatically receive news updates related to that specific location. Users will no longer follow the news. The news will follow them anywhere they go”.
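The core of the geofencing idea is a simple geometric test: does a phone’s reported position fall inside a circular “fence” around a story’s location? The sketch below uses the standard haversine great-circle distance; the coordinates and two-kilometre radius are illustrative, not taken from any real product.

```python
# Minimal sketch of the geofencing test: is a user's reported position
# within a circular fence around a story's location?
# Coordinates and the 2 km radius are illustrative.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def inside_geofence(user, fence_center, radius_km):
    """True if the user's (lat, lon) lies within radius_km of the fence centre."""
    return haversine_km(*user, *fence_center) <= radius_km

story_location = (40.7128, -74.0060)   # a story geotagged in New York City
nearby_reader = (40.7130, -74.0070)    # a few hundred metres away
distant_reader = (51.5074, -0.1278)    # London

print(inside_geofence(nearby_reader, story_location, 2.0))   # True
print(inside_geofence(distant_reader, story_location, 2.0))  # False
```

A hyperlocal news app would run this check (or let the platform’s location service run it) each time the phone reports a new position, pushing only the updates whose fences the reader has entered.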

Locating people is also possible through sensor technology. “Just put sensors in clothes and coffee cups to keep track of everything people are doing”, Amy says. “There are a lot of opportunities for reporting, but also a lot of privacy concerns. Data can be uploaded to the web, where reporters can look for them and use them to write their stories”.

Once you have got the information you were looking for, the next step is delivering it to your users according to their specific needs. “Flipboard.com is a dynamic content-generation platform which lets users select Twitter feeds, Facebook accounts and other web sources on their favourite topics, and automatically creates paginated online magazines displaying updates on those topics”, Amy says.

The last sessions focused on news apps, including those which help make public data available in a more user-friendly way, tools for data visualisation and techniques for video shooting, completing the hyper-tech gallery that had already covered web design and search engines on Day 1.

Stefano Valentino is an Italian journalist based in Brussels. Since 2008 he has run EuroReporter.eu, his own customised EU online information service. In 2008 he also founded the non-profit association Reporters for an Interactive, Cooperative and Specialised Information (Ricsi).

Here are thoughts from Tony Hirst, one of the first adopters and success stories for the Guardian’s Open Platform, on what the OP’s DataStore is and is not doing, in terms of data curation (or gardening). He asks:

“Is the Guardian DataStore adding value to the data in the data store in an accessibility sense: by reducing the need for data mungers to have to process the data, so that it can be used in a plug’n’play way by the statisticians and the data visualisers, whether they’re professionals, amateurs or good old Jo Public?”

Hirst has a number of queries in regards to data quality and ‘misleading’ linking on the Guardian DataBlog. In a later comment, he wonders whether there is a ‘data style guide’ available yet.

If you’re not all that au fait with the data lingo, this post might be a bit indigestible, so we’ll follow with a translation in coming days.

Journalism.co.uk sent OU academic, mashup artist and Isle of Wight resident Tony Hirst some questions. Here are his very comprehensive answers.

What’s your primary interest in – and motivation for – playing with the Guardian’s Open Platform?

TH: Open Platform is a combination of two things – the Guardian API and the Guardian Data Store. My interest in the API is twofold: first, at the technical level, does it play nicely with ‘mashup tools’ such as Yahoo Pipes, Google Spreadsheets’ =importXML formula, and so on; secondly, what sort of content does it expose that might support a ‘news and learning’ mashup site where we can automatically pull in related open educational resources around a news story to help people learn more about the issues involved with that story?

One of the things I’ve been idling about lately is what a ‘university API’ might look like, so the architecture of the Guardian API, and in particular the way the URIs that call on the API are structured, is of interest in that regard (along with other APIs, such as the New York Times’ APIs, the BBC programmes’ API, and so on).

The data blog resources – which are currently being posted on Google spreadsheets – are a handy source of data in a convenient form that I can use to try out various ‘mashup recipes’. I’m not so interested in the data as is, more in the ways in which it can be combined with other data sets (for example, in Dabble DB) and/or displayed using third-party visualisation tools. What inspires me is trying to find ‘mashup patterns’ that other people can use with other data sets. I’ve written several blog posts showing how to pull data from Google spreadsheets into IBM’s Many Eyes Wikified visualisation tool: it’d be great if other people realised they could use a similar approach to visualise sets of data I haven’t looked at.

Playing with the actual data also turns up practical ‘issues’ about how easy it is to create mashups with public data. For example, one silly niggle I had with the MPs’ expenses data was that pound signs appeared in many of the data cells, which meant that Many Eyes Wikified, for example, couldn’t read the amounts as numbers, and so couldn’t chart them. (In fact, I don’t think it likes pound signs at all because of the character encoding!) Which meant I had to clean the data, which introduced another step in the chain where errors could be introduced, and which also raised the barrier to entry for people wanting to use the data directly from the data store spreadsheet. If I can help find some of the obstacles to effective data reuse, then maybe I can help people publish their data in a way that makes it easier for other people to reuse (including myself!).
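The cleaning step Hirst describes can be sketched in a few lines: strip the pound signs (and thousands separators) from spreadsheet cells so that a charting tool can read the values as numbers. The sample expense rows below are invented for illustration, not taken from the actual MPs’ expenses data.

```python
# Sketch of the data-cleaning step: turn cells like '£1,234.50' into numbers
# so visualisation tools can chart them. Sample rows are invented.
def clean_amount(cell):
    """Convert a cell such as '£1,234.50' into a float; leave blanks as None."""
    cell = cell.strip()
    if not cell:
        return None
    # Remove the pound sign (U+00A3) and thousands separators before parsing.
    return float(cell.replace("\u00a3", "").replace(",", ""))

raw_expenses = ["£1,234.50", "£87.20", "", "£12,000"]
cleaned = [clean_amount(c) for c in raw_expenses]
print(cleaned)  # [1234.5, 87.2, None, 12000.0]
```

Publishing the figures as plain numbers in the first place, with the currency noted once in the column header, would remove this whole step and the errors it can introduce.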

Do you feel content with the way journalists present data in news stories, or could we learn from developers and designers?

TH: There’s a problem here in that journalists have to present stories that are: a) subject to space and layout considerations beyond their control; and b) suited to their audience. Just publishing tabulated data is good in the sense that it provides the reader with evidence for claims made in a story (as well as potentially allowing other people to interrogate the data and maybe look for other interpretations of it), but I suspect it is meaningless, or at least of no real interest, to most people. For large data sets, you wouldn’t want to publish them within a story anyway.

An important thing to remember about data is that it can be used to tell stories, and that it may hide a great many patterns. Some of these patterns are self-evident if the data is visualised appropriately. ‘Geo-data’ is a fine example of this. Its natural home is on a map (as long as the geocoding works properly, that is: the mapping from location names, for example, to latitude/longitude co-ordinates that can be plotted on a map).

Finding ways of visualising and interacting with data is getting easier all the time. I try to find mashup patterns that don’t require much, if any, writing of computer program code, and so in theory should be accessible to many non-developers. But it’s a confidence thing: at the moment, I suspect that it is the developers who are more likely to feel confident taking data from one source, putting it into an application, and then providing the user with a simple user interface that they can ‘just use’.
You mentioned ‘lowering barriers to entry’ – what do you mean by that, and how is it useful?

TH: Do you write SQL code to query databases? Do you write PHP code to parse RSS feeds and filter out items of interest? Are you happy writing Javascript to parse a JSON feed, or would you rather use XMLHTTPRequest and a server-side proxy to pull an XML feed into a web page and get around the domain security model?

Probably none of the above.

On the other hand, could you copy and paste a URL to a data set into a ‘fetch’ block in a Yahoo pipe, identify which data element related to a place name so that you could geocode the data, and then take the URL of the data coming out from the pipe and paste it into the Google maps search box to get a map based view of your data? Possibly…

Or how about taking a spreadsheet URL, pasting it into Many Eyes Wikified, choosing the chart type you wanted based on icons depicting those chart types, and then selecting the data elements you wanted to plot on each axis from a drop down menu? Probably…

What kind of recognition/reward would you like for helping a journalist produce a news story?

TH: A mention for my employer, The Open University, and a link to my personal blog, OUseful.info. If I’d written a ‘How To’ explanation describing how a mashup or visualisation was put together, a link to that would be nice too. And if I ever met the journalist concerned, a coffee would be appreciated! I also find it valuable to know what sorts of things journalists would like to be able to do with the technology but can’t work out how to do. This can feed into our course development process, identifying the skills requirements that are out there and then potentially servicing those needs through our course provision. There’s also the potential for us to offer consultancy services to journalists, producing tools and visualisations as part of a commercial agreement.

One of the things my department is looking at at the moment is a revamped website. It’s a possibility that I’ll start posting stories there about any news-related mashups I put together, and if that is the case, then links to that content would be appropriate. This isn’t too unlike the relationship we have with the BBC, where we co-produce television and radio programmes and get links back to supporting content on OU websites from the BBC website, as well as programme credits. For example, I help pull together the website around the BBC World Service programme Digital Planet, which we co-produce every so often; the site gets a link from the World Service website (as well as from the programme’s Facebook group!), and the OU gets a mention in the closing credits. The rationale behind this approach is getting traffic to OU sites, of course, where we can then start to try to persuade people to sign up for related courses!

Computer-assisted reporting (CAR) is nothing new, but innovations such as the Guardian’s launch of Open Platform are leading to new relationships and conversations between data/stats experts, programmers and developers (including the rarer breed of information architects), designers and journalists – bringing with them new opportunities, but also new questions. Some that immediately spring to mind:

How do both parties (data and interactive gurus and the journalists) benefit?

Who should get credit for new news stories produced, and how should developers be rewarded?

Will newsrooms invest in training journalists to understand and present data better?

What problems are presented by non-journalists playing with data, if any?

What other questions should we be asking?

The hashtag #datajourn seems a good one with which to kickstart this discussion on Twitter (Using #CAR, for example, could lead to confusion…).

So, to get us started, two offerings are coming your way in #datajourn parts 2 and 3.

Please add your thoughts below the posts, and get in touch with judith@journalism.co.uk (@jtownend on Twitter) with your own ideas and suggestions for ways Journalism.co.uk can report, participate in, and debate the use of CAR and data tools for good quality and ethical journalism.