Open Data Institute Friday Lunchtime Lecture: "Data for Democracy"

I'm just going to talk a bit today about a project I've been doing and will be doing for the next little while. All around democracy. But I'm specifically going to talk about the data. First of all who am I? I mean some of you know me, but not all. I work here at the ODI, I'm a developer, and I've been an engineer for sixteen or so years, something like that. I'm a parent, I've been a parent for six years and also I'm quite an activist in terms of trying to solve problems that are out there. And, I'm an optimist as well, so I'm trying to, I believe that we can make the world a better place, however naive that might be. An important point to note at the beginning of the talk is that while I work at the ODI, this is not about anything that I'm doing at the ODI, this is a completely separate project, and I'm sure Gavin would want me to state that with a nice big logo. [Laughter] So whatever I say here is not the opinion of the ODI.

So, we have a bit of a problem in this country in participation in democracy. So, turnout for General Elections is on a steady trend downwards, we have a bit of blip here at the end, but it's a lot lower than we'd like it to be. People are disengaged from their democracy, from their, you know, people feel powerless, really, to control it, and I'm gonna not get political generally, apart from to put that up. [Laughter] When people feel powerless, they tend to, you know, the easy answers tend to win and it becomes easy to convince people of certain things if you've got a nice easy answer. But I don't believe in that. I think we've got really big problems but we don't really have the political will to solve them, and the current institutions that we have don't really show any sign of dealing with those - the long term problems that we need to deal with.

And I tend to get annoyed about that kind of thing, and, a couple of years now, a year and a half ago, at Open Tech, somebody said something in one of the talks that in order to truly change the system, you have to become the system. This was in the context of GDS, who have taken all the amazing work that Civic Tech hackers and things have done outside government and taken it in and become part of the Civil Service and really engage with the system and change it from the inside. And I think that's true. At some point we have to stop sort of tinkering around the edges and become part of it. So, I wrote, again about a year and a half ago, I wrote a blog post in which I became a bit annoyed and ranty about the state of politics and things like that and my own personal opinions on what was going wrong and the fact that I didn't feel that there was anyone representing the kind of future that I want to see, the sort of long-term vision, the optimism, the taking us towards a better future, it all seemed to be about stepping backwards, about closing in, about worrying about us, not us as a species necessarily.

And, like I said, I'm a software developer so I tend to do things in a certain way, and one of those things is that I tend to throw software tools at problems. And so I said we should get started by setting up a GitHub repository, we should make some new options and we should start collaborating together on some policy for a better future, just as a kind of "Let's do something" and we'll stick all that together and work together. There's going to be a few technical aspects to this, it's about software development, open source and things like that so, I can answer questions about what these things particularly are at the end if there's stuff that doesn't make sense. But it's a way of working together using common open source tools really.

So, I then resolutely refused to do that because I have too many projects. Unfortunately, somebody else did which then gave me the permission to go and actually start adding to it. So, yeah, that was actually a bit of a lack of self control. And this is the thing. So we started working on, about a year and a quarter ago, a bunch of us started working on this thing which we called the OpenPolitics Manifesto and it's an experiment in collaborative manifesto building. So if you were going to write a manifesto from scratch, well, how would you do it? We thought well let's take an open source approach, we'll let anybody contribute, we'll try and reach consensus that way and we'll see what happens. I mean, as I say, I tend to use software ideas for things and we'll see what breaks, see how far I can push them.

And that produces this, this is the OpenPolitics Project website which is at openpolitics.org.uk and this is up there now, you can go and look at it. And it's a website which lists all of these manifesto ideas, all of these policy ideas, and there's a big edit button in the corner just like you're on something like Wikipedia, and so anybody can actually go, create an account, edit the text to what they think it should say and then we reach a consensus and work together to produce that. So, it's at openpolitics.org.uk and that's sort of the open source manifesto, and it's a way of trying to apply the open source ideas.

So, let's talk about that project a little bit. Who can edit it? The idea is that anybody can edit, anybody can go along, create an account, it's no more difficult than maybe creating an account on, say, Wikipedia, something like that, and you just type in the box, and the changes go into a queue and then they get reviewed and so on. It's a little bit more difficult than that, we're using tools that aren't really designed for non-technical users, so we want to try and make that easier, I have a project in progress at the moment to do that, but we'll see. But certainly we've had contributions from people who've not used the system before, and who we haven't explained it all in great detail to, so that's good. So, basically anyone.

Who decides what to accept? So, surely anybody can come along and type in anything they want, somebody can go and say, let's bring back capital punishment, and they can. But, it won't necessarily get accepted because what we do is we allow the existing contributors to decide what goes in. And again, this is very similar to an open source model where anyone can make a submission into your bit of software, but in order to keep the whole bit of software working in the same way, going in the same direction, producing the thing you're supposed to, there's sort of guidance from that existing pool of contributors. So that's what we do here as well. You need to gain the approval of the existing contributors in order to get your change in, but once your change is in, obviously, you are one of those contributors and you immediately get a vote.

How do we get a vote? So, it's quite interesting, we started out thinking that we'd have a sort of straight majority, you know, for and against on each thing, but, with a long tail of participation, you have a few people who're very active, and a lot of people who're right down the end and sort of contribute once and go away. And so that doesn't really work, because how do you decide what's needed for that majority? So we use a slightly different system which is sort of based on a blackballing approach, which is where you need a certain number of people in order to accept a change, but you only need one person to block a change. So if somebody thinks something's incompatible, or not written well enough, then they can block that change. Then we work through together, we discuss, to try and work that through and remove the block, and then get it accepted.
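The blocking rule described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name, the vote labels and the threshold of two approvals are all assumptions, since the talk only says "a certain number of people".

```python
# Hypothetical threshold: the talk does not give the real number.
REQUIRED_APPROVALS = 2

def decide(votes: dict[str, str], required: int = REQUIRED_APPROVALS) -> str:
    """Apply a blackball-style rule to one proposed change.

    votes maps a contributor's name to "for", "against" or "abstain".
    A single "against" blocks the change, however many approvals it has.
    """
    if any(vote == "against" for vote in votes.values()):
        return "blocked"
    approvals = sum(1 for vote in votes.values() if vote == "for")
    return "accepted" if approvals >= required else "waiting"

print(decide({"alice": "for", "bob": "for"}))                    # accepted
print(decide({"alice": "for", "bob": "for", "carol": "against"}))  # blocked
print(decide({"alice": "for", "bob": "abstain"}))                # waiting
```

Unlike a straight majority, this never needs to know how many contributors exist in total, which is the problem the long tail of one-off contributors creates.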

And the last thing is who counts the vote? And we have scripts that do this, we have robots, and all things should be run by robots if at all possible. [Laughter] When it involves simple matters like counting. So we have, this is the votebot, and it actually lists all the open votes, when people make changes they get added here, we count up how many votes for and against, and abstained, and so on, and so you can actually see everything that's going on, and this is all handled automatically and it gets updated in GitHub's build status flags, which are a way of GitHub knowing that something is okay to get merged in, to summarise. So we actually hook it up to that so when we go and look at the change on GitHub it says "Yes this has passed" or "No this hasn't passed," so we don't merge things by accident. At the moment it does take an admin to merge things, which is slightly annoying, I'd rather have the robot do that as well, but life is short.
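As a rough sketch of what a counting robot like this might do, the snippet below tallies votes and builds the kind of body that GitHub's commit status API accepts (a `state` of `success`, `failure` or `pending`, plus a `description` and `context`). It is not the real votebot: the function names, the vote labels, the threshold of two approvals and the `votebot` context string are assumptions, and nothing is actually sent to GitHub.

```python
from collections import Counter

def tally(votes: list[str]) -> dict[str, int]:
    """Count how many votes were cast for, against, or as abstentions."""
    counts = Counter(votes)
    return {label: counts.get(label, 0) for label in ("for", "against", "abstain")}

def status_payload(votes: list[str], required: int = 2) -> dict[str, str]:
    """Build a commit-status body; `required` is a made-up threshold."""
    counts = tally(votes)
    if counts["against"] > 0:
        state = "failure"   # one objection blocks the change
    elif counts["for"] >= required:
        state = "success"   # enough approvals, safe to merge
    else:
        state = "pending"   # still waiting for votes
    description = (
        f"{counts['for']} for, {counts['against']} against, "
        f"{counts['abstain']} abstained"
    )
    return {"state": state, "description": description, "context": "votebot"}

print(status_payload(["for", "for", "abstain"]))
```

Posting that dictionary as JSON against the commit at the head of a pull request is what would make the "passed" or "not passed" flag appear next to the change.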

So it began about 15 months ago, and we've ended up now with over 8,500 words of policy and more than 20 contributors - I counted this morning and it was 23. Which is not a massive amount but it's certainly a more diverse pool than the one or two that we started with. And so if we go on to May this year, I made another unwise decision, which is that I thought, "Oh, hold on. Still nobody's representing what I want out of this, and I have this platform, so all right, let's go for it, I'm going to stand" and I decided to stand for Parliament next year. [Laughter] Um, I don't know what I'm doing. At all. And that's okay, and actually that's good, I think that career politicians and so on... I don't know. I don't really like it as a concept. I'd rather have people who knew other things, with experts to help them obviously, but I think this is okay. So, I have to stand. Does anybody know who that guy is? A few.

Important. It's Francis Maude, the current Minister for the Cabinet Office, what's the exact title, I can't remember. He's my local MP and so seeing as one thing is that I think people should stand where they live, I have to stand against him. There's a slightly awkward thing in that he was involved in some of the setting up of funding for the ODI, so technically... [Laughter] I kind of have a job, because yeah, anyway... [Laughter] It's not intentional, it's just where I live, I can't help it. But it does mean it's a very safe Conservative seat, so in a way that's kind of annoying as well.

And we set up a party, or we're setting up a party. But that's a different story that I'll talk about another time. With a logo and things, so we're setting up this party which is called Something New, and the idea is that it's forward looking, it's optimistic, it's trying to get those new ideas to take us forward, out there. And that's what I'm spending a lot of my time doing, outside work, at the moment. All the time doing outside of work really. Trying to get that message out there, trying to get more people involved, we've actually got two candidates now, not just me, and we're trying to get more and more people involved so, have a look at the website, if you think that's interesting. I'm not really going to talk about the policies, you can have a look, you can find that out for yourself, this isn't the place.

What I'm going to talk about that, no it's not, not going to talk about that, what I'm going to talk about is data, but one of the things with Something New is actually it's a different project to the OpenPolitics Project. So, it's adopted that open manifesto, but it doesn't mean nobody else can adopt it as well. You can stand as an Independent and use that manifesto, a bit like the sort of open source model of Wordpress being a bit of software that you can use, but wordpress.com providing it as a service, so we're kind of a political platform service. You can run your own if you want to, or you can use ours, except we don't charge you a monthly fee, and there's no SLA. [Laughter]

But what I am going to talk about is data. And, so, the various bits of data that I've looked at as I've been going through this process to work out what I'm doing, and to work out what I need to do. And so when I started on the journey I thought "right, well, okay." I'd already announced this obviously, I'd decided to just say I'm going to do it and then find everything out. I had to check after a few, I think it took me a couple of weeks, I phoned up the Electoral Commission to check I wasn't breaking the law, turns out I wasn't, so that was good. [Laughter] It doesn't really matter what you say, all that stuff comes later on. So that's fine.

Yes, exactly, yes. So I start by looking at the results in my area. Now this is a very small graph but you should be able to see the general pattern. At the end here we have the Conservatives, followed by the Liberal Democrats and then a smattering of other people. We've got 77,000 voters in the constituency. That was quite interesting. And 72% turnout, which was higher than average, so they're quite passionate about voting Conservative. [Laughter] Again, that's going to be fun. I'm in this for the long haul, this isn't the only time. This is the first time and we're going to keep going and keep going, this is hopefully going to be something that builds.

So this is what I'm up against, which means to get my deposit back I need to get 2,800 votes, which... who knows? Could be fun. But yeah, that's the first bit of information. This is all open data, it's all under the OGL, and you can go find that anywhere. Next thing I went and looked up is the Electoral Commission data as well, who also collect data on what parties spend during their campaigns, and what parties raise as well. And so, I looked at the spending, and there's a very similar pattern. The party that spent the most, got the most votes. The party that spent second most, got the second most votes. There's a few little outliers. UKIP spent a bit more but didn't get as many votes as they should have done for their spending.
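The deposit figure can be checked with a little arithmetic. This is a minimal sketch using the rounded numbers quoted in the talk (77,000 voters, 72% turnout) and assuming the 5 percent of votes cast threshold that applies to deposits in UK parliamentary elections; the talk does not state the rule itself, and the function name is made up for illustration.

```python
def deposit_threshold(electorate: int, turnout_percent: int,
                      threshold_percent: int = 5) -> int:
    """Approximate number of votes at the deposit threshold.

    Integer arithmetic keeps the result exact for whole-number inputs.
    """
    votes_cast = electorate * turnout_percent // 100
    return votes_cast * threshold_percent // 100

# 77,000 voters at 72% turnout is 55,440 votes cast; 5% of that is 2,772.
print(deposit_threshold(77_000, 72))
```

That lands at 2,772, in line with the roughly 2,800 votes mentioned above; the exact figure depends on the real count of valid votes rather than these rounded inputs.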

And, the one interesting thing here was that I'd always been confused why I'd never got a leaflet from the Green Party in the last election. I mean, they had a candidate, didn't they want me to know? No. No they didn't. They didn't spend anything, a grand total of 0, they just put the guy up on paper. They got 570 votes, but when I first announced, a couple people said "Why are you splitting the Green Party vote?!" [Laughter] I'm not too worried about that. So, yeah, that's quite interesting. There's some really interesting data in here, it's all broken down in terms of how it's spent, when it's spent, and I think there's quite an interesting site to be made around measuring spending against the number of votes gained, around whether people keep their deposits and what the categories are. It's quite fun to explore, but you do have to sort of be looking around the spreadsheets. I think there's a nice website there, but I'll maybe make it after the election. Or, after I've had a rest after the election.

The other thing the Electoral Commission collect is donation data. I haven't really gone into this yet, but I will be hopefully, sometime soon. There's almost certain to be some interesting things in there, though of course you have to declare donations over a certain size and say where they came from, but of course that doesn't get published, because it's private information. We'd like to publish our donations in public, but we'd have to see how that works with privacy and so on.

One interesting thing about the costs is that it actually meant I could go and look and work out what I needed to spend in order to get a leaflet through every door. I discovered the Royal Mail will send something around for you - you get as a parliamentary candidate, one free leaflet from the Royal Mail. Not a free leaflet, sorry, a free delivery from the Royal Mail. There are 47,000 households, so if I print 50,000 leaflets I can get a message to everybody, and that only costs £1,500, so that leaves me to think all those parties that couldn't raise that. You know, there were plenty of other small parties, the guy who was standing as an Independent for instance, who spent nowhere near that, what was he actually trying to do? I'm not entirely sure. But that gives a really good idea of what I need to spend to participate properly, and it seems surprisingly accessible. It seems disturbingly doable. At least to participate, not to actually win, obviously. I'm realistic about this - it's fine.

The next thing I started looking at was maps and there's some cool maps. Because I know I live in Horsham, but what's the constituency? What does it actually cover? I don't really know. So, the Ordnance Survey have a really nice website, they have an election maps website, that has all this information on it. You can go and search for the constituency and it shows all the boundaries of the electoral areas, right down to the ward level. And so you can see, this is Horsham. Yeah, I didn't know we had this bit. That was surprising. Taken a great big bite out of... this actually used to be part of this constituency, which is now a really weird shape. I can't help thinking there should be some more mathematical way of deciding these things. But anyway, that's what I've got to work with. It's quite rural, we've got a couple of... we've got Horsham and then a couple of other large villages I suppose, but mostly quite rural.

Obviously now I wanted this on my wall. I wanted a map on my wall that I could look at every day and go "Ah..." [Laughter] Or something, I don't know. I don't actually do that, that'd be weird. [Laughter] So I did. I got a map. [Laughter] The interesting thing is, while the Ordnance Survey will print you a custom map, and it's this big and it's lovely, you can run your hands across it, you can feel the print where they've digitally printed it, it's beautiful, you can feel the topography. You can't get them to print an electoral boundary on it. So the first thing you have to do with your brand new map is draw on it in felt tip. [Laughter] Which took a while because some of the boundaries are really fiddly. In the end I just started going "Oh, it's about here." So there might be a few people on the edge who aren't technically on my map. And this bit's in green because I started using the wrong colour. [Laughter] So, I ruined the whole thing at the beginning really, but it's fine.

The other thing is within the constituency there are loads of wards as well. This is a horrible image taken from my blog, sorry about the awful quality. These are the different wards within the constituency, the different areas that each vote separately. They sort of represent kind of where people live, and what I was thinking was "Well I should go be in front of everyone, I should get to where they are, I should go and talk to them." Be out in all the wards at least once. And so I thought, "Right, what are they? And where shall I go first? And what order shall I do them in?" So I thought "Ha ha. Let's get some more data." So, again, looking at the OS election maps to find all those things, they use the same statistical geography identifiers as all the other government data, so it's linked to those other things, and it means you can go and look at the ONS data from the census, and it's all reported for the same areas, which surprised me that it was actually that easy. And it means that I've got this nice list of all these wards and they vary massively in size. The biggest one, Southwater, is more than three times the size of the smallest one, Rudgwick. Again, that's quite interesting. But it does tell me where I need to be focusing at which time and so on. I actually had a plan to visit all of them in the New Year and then to do the biggest ones this year as well, so I'd get to some of them twice. I have been doing that, but getting the word out's quite difficult, so I might now leave most of it till next year. But next year I will be going to visit in order, just about in order, until just near the election I'm hitting the larger ones. And so that was useful.

The most recent thing that I've done on the data front is party finances. So we've actually started spending some money, we don't exist as an organisation technically yet. But I've started spending money so we need to track it. And I thought, "Well, how does all this get published?" You know, what happens there? And I went to talk to Ian McGill who some of you may know from Spend Network and said "How do all the parties publish their data?" and he went, "Pah. They don't, really. Or it's in a complete mess." And then I said, "How would you want them to?" And he said "Well, if I could have my dream, I'd have all these things. I'd have buyers and unique identifiers for them. I'd have suppliers and unique identifiers for them. The different amounts obviously but then codes so I can then merge things together and find out what people are actually buying." And so we did that, we made this schema, designed with Spend Network, so that they can pull it into their systems really easily and expose all of our spending through their systems. In a way that people can understand. So, there's a few interesting things in here. I was putting in things that I'd bought, and I needed a URI for me. So I thought, "Hang on. What is my URI? What is my unique identifier? I don't know." And I wasn't really sure what it was supposed to be. And then I read around a bit and remembered about things like OpenID, and in the end I just made myself one up. So I created an identifier for myself, at id.floppy.org.uk. That is now me, that's my personal identifier. So that was an interesting thing.
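To make the shape of that wish list concrete, here is a minimal sketch of one spending record with identified buyers and suppliers, an amount and a classification code, written out as CSV. It is not the schema that was actually agreed with Spend Network: every field name, the `example.org` URIs, the date and the code value are placeholders invented for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from decimal import Decimal

@dataclass
class SpendRecord:
    # All field names here are hypothetical, not the published schema.
    date: str
    buyer_name: str
    buyer_uri: str       # unique identifier for whoever spent the money
    supplier_name: str
    supplier_uri: str    # unique identifier for whoever was paid
    amount_gbp: Decimal
    category_code: str   # a classification code, so records can be merged
    description: str

def to_csv(records: list[SpendRecord]) -> str:
    """Serialise records to CSV text with one header row."""
    buffer = io.StringIO()
    names = [field.name for field in fields(SpendRecord)]
    writer = csv.DictWriter(buffer, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()

example = SpendRecord(
    date="2014-01-01",                                # placeholder date
    buyer_name="A Candidate",
    buyer_uri="https://example.org/id/candidate",     # placeholder URI
    supplier_name="Example Printers Ltd",
    supplier_uri="https://example.org/company/0000",  # placeholder URI
    amount_gbp=Decimal("20.00"),
    category_code="PLACEHOLDER",
    description="T-shirt printing",
)
print(to_csv([example]))
```

Keeping URIs in their own columns, next to the human-readable names, is what lets someone else join this data against other datasets without guessing who "Example Printers" is.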

But when you get into supplier URIs, it's trying to uniquely identify the suppliers so people can see who you're paying, really who you're paying. It's quite interesting. Obviously it starts with companies that you're buying things from. And this is where OpenCorporates comes into its own. So they have company information all across the world, millions of companies now, and they each have a nice unique URI, that you can link to, that you can reference. And that was absolutely amazing. They don't have everything yet because I don't think they have Delaware companies, and I think the people we are buying our website services from are probably Delaware registered, so I couldn't link them yet. But, it's a good start. One of the things I had to buy from was a charity, and that got a bit more awkward. I went to look at the Charity Commission website and they had a search form. That was good. But it didn't really link to anything that I could link to. There wasn't anything that I could say, "Here's an identifier for this charity." But then someone pointed me at OpenCharities, which is trying to do the same kind of thing as OpenCorporates but for charities. So there is a page and a URL and the information. So we can actually link to them as well. That's great.

Unincorporated associations: I suspect this is one of those, because I can't work out what they are, but the people I've actually paid for the rental of the room, I suspect they're just an association, and of course they don't have any identifier or exist in any clear way. So that's more awkward. I'm not sure how to deal with that one. And one of the most disturbing is parish councils. Now you'd think this would be easy, the LGA, the Local Government Association, publish standard URIs for different areas of government, for different councils, things like that. They go down to the district level, which is great. They don't go down to the parish level. But I've bought some things from parish councils and I want a URI. So I asked on Twitter and got the response of "Actually, I'm not sure there's even a list." And that's quite disturbing. People always sort of say, "We've worked with this huge company, and the first thing we had to do was make a list of their buildings" and I always think "How can that be true?" These are the building blocks, the lowest level of our democracy. They're real things, they have real power, they have real elections going on, and if we're going to publish election data we're going to have to reference them, and we don't even have a list. Somebody might need to do something about that. We might need to actually make a list of parish councils and give them URLs. There's probably an OpenCouncils project somewhere out there waiting to be done. So, yeah, I've given up on that. Because I've got no idea.

Codes are quite interesting. Ian introduced me to this thing called the UNSPSC code, which is the United Nations Standard Products and Services Code I think, which is an incredible thing. You can go on the website, unspsc.org, and search for the thing you want to represent, and they've got like, one thing I found, there were twenty or so different slight preparations of the same type of food. And then there was one big category for internet, or something. [Laughter] It wasn't actually that bad. Things like finding a code that's up to date for buying software as a service over the internet, it kind of wasn't there, you had to be a bit creative with where you manoeuvred it in. I think this thing is a little bit out of date, and more based around physical products, but we try and represent things in those.

And the other thing is the Electoral Commission reporting codes. So the Electoral Commission have an amazing scheme whereby you have to report your spending in different categories, and there are 6 of those, and they are A to F. Nice and descriptive. But essentially they cover different things like public meetings, travel, accommodation, stuff like that. And so when we're keeping our accounts, we try and include those codes as well, we can't always do it, because not everything kind of fits. There wasn't one for post-it notes. [Laughter] But I put that into the public meetings one, because that's where I used them.

So we have our stuff in CSV files, standard form. We published it, again, through GitHub. We have the CSVs uploaded into a GitHub repository where you have the CSV files themselves, and a data package, which is a JSON file which then points to the different CSVs and tells you what the schema is for instance. So it tells you what all the fields mean and where the files are so that you can just look at this one thing and discover everything else. We validate it with one of the tools that we created here at the ODI, called CSVLint. It actually checks the CSVs to make sure they're well formed, make sure they fit the right schema and so on. So we validated with that. And it's published through GitHub Pages. So GitHub, if you have some files stored in their repositories, if they're in a particular format it will take those files and turn them into web pages. In a fairly simple way through this thing called Jekyll. But it's all published with that and I'll show you in a second.
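As an illustration of the idea, the sketch below builds a small data package descriptor (a `name` plus a list of `resources`, each with a `path` and a `schema` of named, typed `fields`) and runs a very simple check of a CSV against it. This is not CSVLint and not the project's real descriptor: the package name, file name, field names and the `check_csv` function are all assumptions made for the example.

```python
import csv
import io
import json
from decimal import Decimal, InvalidOperation

# Hypothetical descriptor; the real one lives alongside the published CSVs.
descriptor = {
    "name": "example-finances",
    "resources": [{
        "path": "spending.csv",
        "schema": {"fields": [
            {"name": "date", "type": "date"},
            {"name": "supplier_uri", "type": "string"},
            {"name": "amount", "type": "number"},
        ]},
    }],
}

def check_csv(text: str, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    expected = [field["name"] for field in schema["fields"]]
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header != expected:
        return [f"header {header} does not match schema {expected}"]
    errors = []
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(expected):
            errors.append(f"line {line_no}: expected {len(expected)} columns")
            continue
        for field, value in zip(schema["fields"], row):
            if field["type"] == "number":
                try:
                    Decimal(value)
                except InvalidOperation:
                    errors.append(f"line {line_no}: {field['name']} is not a number")
    return errors

schema = descriptor["resources"][0]["schema"]
good = "date,supplier_uri,amount\n2014-01-01,https://example.org/x,20.00\n"
print(json.dumps(descriptor, indent=2))
print(check_csv(good, schema))
```

A real validator checks far more (encodings, quoting, dates, required fields), but the principle is the same: the descriptor is the single place that says what the files are and what shape they should have.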

You can actually have a look at somethingnew.github.io/finances and everything's up there. And this is proper open data. I, last night, generated an open data certificate for this, which is another thing we've worked on here at the ODI, which says all of the information about where to get help with it, how often it's released, all that kind of thing. We've achieved the lowest level of open data certificate, the higher ones are quite hard to get. I'm only two questions off the next one up, which is really frustrating, I was really hoping to get on the next one. So I just need to go back and improve my data, and do a privacy impact assessment on myself. [Laughter] I'm not quite sure how to do that, I'll need to ask my boss about that. So, there's that.

And what we end up with, at the end, is a page, like that. And this is all automatically generated, all the stuff is in the CSV files and in that data package. And then when you come to actually look at the data, here's the data, rendered in the same way. So all this is turned into these HTML pages from that raw CSV, there's no other copy, and if you hit that download button over there you'll get that original CSV file. So, in a way, we've kind of rolled our own data portal. Which is pretty cool. But for an individual piece of data, so we don't need data portals anymore, we can just do it all this way. And so yeah, there's my map, there's a code, a UNSPSC code for maps, it didn't... Facebook advertising, there wasn't one I suppose. T-shirt printing was a hard one to find. [Laughter] Twenty quid this cost me. Anyway, that's basically it. That's the data adventures that I've had on this project. There'll be many more. But I'd just like to finish with a thought about openness in general.
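The step of turning raw CSV into a page can be sketched very simply. The site itself does this with Jekyll on GitHub Pages; the snippet below swaps that for plain Python purely to show the idea, and the function name and sample rows are made up.

```python
import csv
import html
import io

def csv_to_html_table(text: str) -> str:
    """Render CSV text as an HTML table, escaping every cell."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]
    parts = ["<table>"]
    parts.append(
        "<tr>" + "".join(f"<th>{html.escape(cell)}</th>" for cell in header) + "</tr>"
    )
    for row in body:
        parts.append(
            "<tr>" + "".join(f"<td>{html.escape(cell)}</td>" for cell in row) + "</tr>"
        )
    parts.append("</table>")
    return "\n".join(parts)

sample = "item,amount\nT-shirt printing,20\n"
print(csv_to_html_table(sample))
```

Because the page is generated from the CSV every time, the file behind the download button and the table on screen cannot drift apart.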

I think with technical revolutions, come social revolutions. If you look at agriculture, it allowed us to form complex civilisations, with the industrial revolution we moved into cities, if you look at the printing press, education became a thing that anybody could access. And we're at the beginning of this one. We're at the beginning of the revolution that's enabled by the web, by the fact we can all talk to each other, we can all work together. We don't have to send a representative on a horse to London any more. We don't have to do it that way. And there are ways we can move forward, move forward into a better future. And, what I'm trying to do is just a small part of that journey. We'll see how that turns out. And if you want... No hang on.

There's 174 days till the election. [Laughter] There's a Twitter bot called Days to the Election. Tweets every morning. Terrifying. [Laughter] I remember when that said 300, it wasn't that long ago. If you want any more information, I'm blogging about it on my own personal site. You can get me on Twitter and that's the party, somethingnew.org.uk and the manifesto's at openpolitics.org.uk. And that's it. Come and join us.