How can we become better open data producers?

Our session host, Dan Barrett, head of data and search at the UK Parliament, noted that he’d heard two clear messages from the conference so far:

We need to work ever more closely with the users of the data

We need to avoid working on a technical solution that makes an assumption about who our users are and how they will use the data

What else would make data producers better?

Working with people who consume things, whether with their eyes or with machines, is really important. One message from a recent international conference was that we need to stop building platforms and start building portals. Give people the data, but also give them some use of it, and some examples. Whose responsibility is this? The data publishers?

Overall, people need to be better at working with data.

There’s a difference in lexicon. For example, registers are a technical product, with a technical lexicon. However, most people who make decisions about the use of the product are not of a technical bent. How do we up-skill them? How do we make people ask questions of the data they get? Sometimes the sheer fact there’s a crown logo on data means people trust it.

There was a misleading visualisation showing that the poor had got poorer and the rich richer – and you needed fullfact.org to understand why it was misleading.

Across government there’s an assumption that we publish for everyone – which is cool. But you have to make the hard decisions about who the likely audience is, how you reach them, and how you help them use the data. Increasingly it’s clear that just putting the data out there isn’t enough.

Is there any use of personas in the data publishing process? There’s some, and it’s a reasonable way of assessing the different needs and skill levels. Having a practice of working in close proximity to likely users is good. You can iterate your services to better serve them. The communities session yesterday highlighted that there might well be people who aren’t aware of where the data they need is. Who is working with them? There’s the possibility of a very different way of working there.

What’s the version of YouTube’s recommendation algorithm for the interested citizen who wants to know more about, say, vaccinations at a political level? Can you show examples of what a data set has been used for, data sets it’s linked to, cool examples of it in use?
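As a rough sketch of that idea, a publisher's catalogue entry could carry its linked data sets and known reuse examples alongside the data itself, so a page can show them next to the download link. Everything here is an assumption for illustration: the DatasetRecord class, its field names and the placeholder titles are hypothetical and not taken from any real catalogue.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical catalogue entry; field names are illustrative only."""
    title: str
    linked_datasets: list = field(default_factory=list)
    reuse_examples: list = field(default_factory=list)


def related_panel(record):
    """Build the lines a dataset page might show beside the download link."""
    lines = [record.title]
    lines += ["Linked to: " + name for name in record.linked_datasets]
    lines += ["Used in: " + example for example in record.reuse_examples]
    return lines


# Placeholder values, not real data sets or real reuse examples.
record = DatasetRecord(
    title="Example dataset A",
    linked_datasets=["Example dataset B"],
    reuse_examples=["Example news story"],
)
panel = related_panel(record)
```

Whether a human curator or an algorithm fills in those two lists is exactly the open question from the session; the structure is the same either way.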

On the other hand, it’s hard enough to get people to publish at all. Making it harder will just reduce the amount of data published. It sounds like there’s a role for someone else to do this, not the provider. This is a form of journalism. It’s about embedding data journalism around the data. The ONS is hiring data journalists who are there to work with other media organisations. A partnership with a big news organisation jumps site visits up to a couple of million.

Information curator – think playlist curator – is an emerging job. It’s a solution for dealing with an abundance of information. You could have curators working for either your technical consumers or your general consumers. Do we need human or algorithmic curation?

Would a better culture of metadata help? Do we need to own up more to what we actually have? People who know data can tell people about its existence! But can they describe it usefully? Data.gov.uk is a big bucket – and search is not necessarily the best way to mine it usefully. Can you curate examples of the best datasets, and the ways they can be used together? How do you ask for that without deterring the publisher?

There’s a tension between human and algorithmic curation – and the algorithmic curation tools are getting better. Those tools will emerge on the market, and there’s a responsibility to use them. There are some structural implications about exactly how you publish the data technically, so it will appear in some of these search functions.

Could there be a human practice responsible for improving and encouraging use of your data? Not compulsory – an optional support?

Algorithmic angels – the concept of an AI you have a conversational relationship with that monitors your digital behaviour and warns you about actions you are about to take where you don’t understand the ramifications. Could a version of that exist with public services?

A human (or set of humans) that can see the links across what’s being provided by different departments or services? A taxonomy can help that – and that value should spread. Standard dates give you date pickers. Standard geo-references give you automatically embedded maps, and so on.
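A minimal sketch of why standard types spread value: if every publisher declares a field's type from a shared vocabulary, a front end can choose its widget from the type alone, without knowing anything else about the data set. The type names, widget names and the Field class below are all assumptions made for illustration, not an existing standard or library.

```python
from dataclasses import dataclass

# Hypothetical shared vocabulary: declared field type -> front-end widget.
WIDGETS = {
    "date": "date-picker",
    "geo-point": "embedded-map",
}


@dataclass
class Field:
    name: str
    type: str  # as declared in the data set's metadata


def widget_for(field):
    """Pick a widget from the declared type, falling back to plain text."""
    return WIDGETS.get(field.type, "text-box")


# Illustrative fields; the third uses a type outside the shared vocabulary.
fields = [
    Field("published", "date"),
    Field("location", "geo-point"),
    Field("notes", "free-text"),
]
widgets = {f.name: widget_for(f) for f in fields}
```

The field with a non-standard type still works, but it only gets a text box, which is the cost of stepping outside the taxonomy.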