Leeds Data Mill
http://staging.leedsdatamill.org
Open Data for Leeds
Mon, 13 Oct 2014 15:08:29 +0000

What is an Open Data Project? (Part 3)
http://staging.leedsdatamill.org/guest-blog/open-data-project-part-3/
Mon, 29 Sep 2014 10:04:41 +0000

In part 1 and part 2 we looked at why and how you might approach an open data project. So what exactly is this data that we’re talking about? What does it mean?

Data and Wisdom

“They say there’s enough data in the world to make people confused, but not enough to make them wise.”
Louis Cyphre [misquoted], from the movie Angel Heart

There’s a way of thinking about data’s place in the world, and it’s expressed like this:

Data ⇢ Information ⇢ Knowledge ⇢ Wisdom

Here’s a true-life example to explain the difference between these things:

I’m standing at the bus-stop trying to work out when my next bus is, looking at the timetables – lots of data! I find the bus service I need, and the next times for my bus – this is more like information, data that is relevant to me here and now, that I can use to decide what to do next.

I remember that my last bus home is 11:00pm (this is knowledge, gained from past experience, memory of getting the last bus). But it’s actually 11:30… oh, if I had any wisdom, I wouldn’t have stayed in the pub so long that I now have to find a taxi…

Let’s take that analogy and apply it to the complex systems and processes within your own organisation. If you’re trying to understand how all the IT systems within your business operate and link together, or are simply looking for some interesting data to publish on Leeds Data Mill, it’s unlikely that you’ll make first contact with the data itself. Whether it’s in databases running on servers or in files saved in folders, the data is initially completely inaccessible to you.

To get to the data, what you’re really doing is following the data ⇢ knowledge trail, but in reverse:

Knowledge = people that use the systems every day

Information = what the IT systems provide

Data = databases and files

This is why open data is all about people; data on its own doesn’t help you find or understand it, it just is. Your route to the data will be through the people who receive it, use it, view it.

As you follow the rabbit hole towards the data, you’ll be picking up important information about what the data means, even before you’ve seen it. For example:

who is responsible for the data?

where is it stored?

how much of it is there?

does it contain personal information about people?

This information is metadata, data about data. (To be honest, the academic distinction between metadata, data and information is not something I think about too much. All that matters is ‘does it seem interesting and useful? Does the metadata help me organise and explain the data that it describes?’)

Metadata is data! In other words, it needs to be treated like data, in a structured way. There are a few standard ways of describing data (e.g. http://dublincore.org/). Such standards are necessary to help people share and link their various data with each other in a consistent way, and this is key to making Open Data a workable idea. But such standards won’t necessarily help you to make sense of your own data; that’s specific to your organisation. Remember, the point is first to make the data understandable and useful within your own organisation, and then to anyone looking at it should you publish it. So there are really 3 sets of metadata:

Describing what the data means to your organisation

Describing what the data that you publish means to people outside your organisation

Describing how the data is published and what its context is (e.g. ‘data for Aug 2014’)
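
As a rough illustration, the three sets of metadata could be kept together in one structured record. The sketch below is only an assumption about how that might look: the field names under `internal` are made up to mirror the questions listed earlier (who is responsible, where is it stored, how much is there, does it contain personal information), the `dc:` keys borrow element names from the Dublin Core set, and every value is a placeholder rather than real data.

```python
import json


def build_metadata(internal, published, context):
    """Bundle the three sets of metadata into a single record."""
    return {"internal": internal, "published": published, "context": context}


record = build_metadata(
    # 1. What the data means inside the organisation (hypothetical fields
    #    that mirror the questions: who, where, how much, personal data?).
    internal={
        "owner": "Example Team",
        "stored_in": "example database",
        "row_count": 1200,
        "contains_personal_data": False,
    },
    # 2. What the published data means to people outside the organisation,
    #    using Dublin Core element names as keys.
    published={
        "dc:title": "Example dataset",
        "dc:description": "Placeholder description of what each row represents.",
        "dc:creator": "Example Organisation",
        "dc:format": "text/csv",
    },
    # 3. How the data is published and what its context is.
    context={
        "dc:coverage": "Aug 2014",
        "dc:date": "2014-09-01",
        "dc:rights": "Placeholder licence",
    },
)

# Metadata is data: it can be serialised and shared like any other record.
print(json.dumps(record, indent=2))
```

Keeping the internal description separate from the published one means the internal notes can stay private while the published and context parts travel with the dataset.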

What I’m really saying is, it’s all about the data, but the data is only the start. The real problem is in revealing the hidden relationships between things, between people, places and services… that discussion is for another time maybe!

@johnmaeda: “It’s not about choosing X versus Y. It’s about choosing the right relationship between X and Y.” —@jshefrin

For those of you just beginning your open data trip, I hope these blog posts will inspire you to explore further. Don’t forget, there is a wonderful community here in Leeds and WY to support you!

In the first part of this post we explored why companies should release open data. In this part, I’ll share some thoughts about how you might scope and plan an open data project.

Scope and Possibilities

One thing to consider is that an open data project may represent an opportunity to do more than simply publish some data. For example:

take advantage of the overlap between publishing open data and how you respond to the requirements of the Freedom of Information Act;

whilst working out how your databases and applications fit together, in terms of the data they store and the data flows between them, you could also review your security policies and access control mechanisms to provide assurance that all is well or to flag up areas which need improvement;

you’ll gain an understanding of how effectively data is currently being utilised within your organisation, which could lead to projects to improve internal Management Information, or flag up some opportunities for process-improvement.

Once you’ve agreed the project’s scope, you will want to create a plan for the work. By all means start with some tasks and milestones but before you fire up MS Project or your organisation’s de facto reporting system, step back and consider exactly what your deliverables are. What is the true nature of the project? What needs to happen after it ends, to keep the open data agenda alive in your organisation?

The Nature of the Beast

You could treat the project as a feasibility study, to be followed by a review and the planning of next steps. There may be benefit in starting the project in a slightly underground way, without too much fanfare. This way, your first meetings with your colleagues who create or use the data will help inform your approach when you decide to give the project a higher profile.

However ambitious the scope, the truth is that the first couple of months of the project will be a voyage into the unknown – how much so will depend upon how self-documenting the organisation already is and how centralised your systems are. So be careful about what you are committing to before you sufficiently understand the nature of the beast. Don’t assume that what initially sounds straightforward (find → understand → publish → maintain data) will be easy, until you’ve actually worked with a team to understand their systems, processes, business events and data.

Finding a Level

If you’re starting with a high-level, system-wide overview, you’ll have to balance the thoroughness of the analysis with the number and complexity of the systems in your organisation. If there are many interlinked systems and teams, understanding the key data flows between systems is essential to understanding the context and ownership of the data.

You start at a high level in order to help you prioritise where best to focus resources on more detailed analysis or to find the best place to start looking at how to extract suitable open data. It is often hard to stay at a high level without getting drawn into the minutiae or following up interesting leads. How you monitor progress and react to difficulties or slippage depends on your stated approach and the scope of the project. What is certain is that once you start, you’ll be under pressure to find and publish some open data as soon as possible.

Sprints

It can be helpful to think of an open data initiative as a programme or a set of sub-projects, rather than a single self-contained series of tasks with inter-dependencies all neatly arranged between a start and a finish date. Some aspects will be linear (e.g. organise and run a workshop), others will repeat during the life of the project and then continue forever in some other form, perhaps becoming embedded in the day-to-day operations (e.g. analyse a system, find and publish data, maintain it). If you start by looking at the project in this modular manner it can help avoid creating overly complex and hard-to-maintain project plans.

Another way of handling the progression from high to detailed level is to create small, self-contained mini-projects or ‘sprints’. The starting point of a sprint might be “I can see that there is good data in here, some work is required to extract it” – with the deliverable being the published data. Identify potential sprints as you go along, and deal with them in a prioritised way. Be careful not to spread yourself or your team across too many sprints at once, since that defeats the point of a sprint as something ‘do-able’ in a relatively short space of time (weeks rather than months). Occasionally you may hear about some new data and think that you might be able to publish it relatively easily, in which case you may decide to put other things on hold for a while to focus on that.

I hope this post has given you some ideas about how to scope and run an open data project. In the final part of this article, we’ll get into the data – what it is and what to do with it!

Mike is a Business Analyst at West Yorkshire Combined Authority (WYCA), which is the official government agency for transport across West Yorkshire. Get in touch with him on Twitter @dotlineform

All views expressed in this article are Mike’s and do not necessarily reflect those of his employer WYCA.

Opening Leeds School of Data
http://staging.leedsdatamill.org/guest-blog/opening-leeds-school-data/
Fri, 15 Aug 2014 11:19:07 +0000

In my first days of my first year at secondary school, I was introduced to the formal concept of ‘science’. I was worried, scared a little even. Lessons in ‘physics’, ‘chemistry’, and – gulp – ‘biology’ (i.e. cutting up frogs and sex lessons). In a new environment with new teachers on new subjects. It was something unknown.

Which sounds a bit daft now with hindsight. Science is our race’s understanding of our world. Science is all around us. But back in 1986, when I was 11? I didn’t see that. And the alien nature of it all added to the bewilderment.

When you’re growing up and someone says “You will learn this”, especially when you are younger and at school, you don’t tend to get the chance to ask why or see its relevance. Nor any choice whether you will or won’t do it. You just do it.

You are taught whether you want to learn it or not, by at least attending – let alone participating in – the lessons. And you put a lot of faith in the teacher that what they are imparting is ‘right’ not ‘wrong’.

As we leave formal education this sort of thing tends to still happen. You find yourself in ‘lessons’ being ‘taught’ by people who may or may not be qualified to do so. But we just accept it as part of the “university of life”.

Over the past year or so I have been involved in many projects and talks around the ideas of data, open data and big data.

Some of the talk is, admittedly, complex and complicated, and actually needs to be.

But there have been times when I’ve read, listened, and watched as people unnecessarily present “data” as if they are trying to induce how my 11-year-old self once felt: dumbfoundment.

Things – concepts – can be as simple or complex as you view them to be. Inherently you need to start with a simple understanding of any concept, then build upon that.

The initial concepts of data, open data and big data aren’t that complex, aren’t that complicated.

Taking a ‘zero prior knowledge’ start, the aim is to raise understanding and awareness of data through things the students use and connect with every day. And to do it at a comfortable pace, in a friendly environment, in just four 90-minute lessons over four weeks. We know you’ll come away with some mad data skillz!

We’ve deliberately made the cost of the course as cheap as possible to make it as accessible as possible: A tenner for four lessons works out at £2.50 a class.

We’ve also set the course up so it can be easily refined and reused. We already have plans to run the course again later, both in its four-week form and as a day course. (Do get in touch if you are interested in doing the course at another time.)

The course is looking to extend the existing prevalent sharing approach/mentality that a number of other Leeds events/forums have at their heart, like the Leeds Digital Lunch, Forefront, and Hey Stac!, something the area should be proud of.

Personally it’s a great chance to do something in the area I live in while I am increasingly spending more time working away (and therefore less time actually in Leeds/Bradford). The Mill’s mission to provide as much open data from the city of Leeds as possible is one I am behind, and working to educate people about data and its possibilities is a massive part of that.

What is an Open Data Project?
http://staging.leedsdatamill.org/guest-blog/open-data-project/
Fri, 08 Aug 2014 11:02:18 +0000

Data is not a thing in itself, it always has context. Think of data as a parcel that has come through the post. Inside the parcel is ‘data’, but the parcel itself is also data – you’ve only got the data inside because it’s been created, packaged, and sent to you. So finding and understanding data is not just about locating a database on a server, or a file in a folder, it’s about understanding the people and the processes that give that data meaning and relevance.

Making data ‘open’ is about understanding its story; how it came to exist and what it represents, and then enabling other people to write the next chapter.

There’s a 3-piece jigsaw that businesses need to piece together when starting an open data initiative:

Why do it?

How do we do it?

What to do with the data!

This 3-part post is designed to explore this puzzle, highlight some of the challenges and share solutions for overcoming the roadblocks. So let’s start at the beginning, why do it at all?

My 5 year old nephew has an intuitive understanding of data that is already perfectly in tune with the Information Age. He doesn’t see the objects in the Minecraft game as the complex but finite data structures that they actually are, he simply sees them as opportunities to make something exciting happen. It’s a wonderful thing to watch him play and a great analogy for what Open Data is all about – seeing beyond the data and exploring what can be done with it.

When we grow older, we lose some sense of the present moment, because we have so many memories and we are ever more conscious of the reducing future. So it’s understandable that the organisations we work in also have a similar sentimentality about the past and hopes for the future. It often takes something quietly disruptive to bring us back to the present moment, to make us think, what is the purpose, what is our relationship with the community in which we live, and the wider world?

“While it may be difficult to change the world, it is always possible to change the way we look at it.”
~ Matthieu Ricard
#mindfulness

Here are some completely unsubstantiated facts for you: most organisations have more data than they know, and they use 1% of it. Of that 1%, only 10% is actually business-critical, classified, confidential data. The rest could probably be discarded, along with the processes that keep feeding it. I’d agree with the sentiment, but as for discarding data, no! There is good data in your business; the fact that you might not be analysing or reporting on it is another thing altogether. An open data project will bring such things to the surface, and it will highlight many opportunities for improving your internal processes in a way that no business-process driven project has ever managed to achieve. Why? Because open data is about realising potential, fostering new ideas and collaborations. So let’s first understand our data and think about how we might make better use of it.

But why share it at all? Because our data might be useful to other people who want to understand and improve the community. Improving the decision making and influence of the community is a good thing, no? We as individuals are part of this community, our colleagues, customers, suppliers and stakeholders are too. When a business (and its data) becomes separate from the people it is about and serves, it becomes dysfunctional. Sharing data represents more than what it is, it’s a statement of engagement, trust and inclusiveness. Get your head around that and you already have an open data project!

Now we have the why, we are on a journey to figure out how. The next part of this article will explore the wonderful world of project management and provide some examples of best practice to get your open data project up and running!

Mike is a Business Analyst at West Yorkshire Combined Authority (WYCA), which is the official government agency for transport across West Yorkshire. Get in touch with him on Twitter @dotlineform

All views expressed in this article are Mike’s and do not necessarily reflect those of his employer WYCA.

Leeds Dashboard
http://staging.leedsdatamill.org/projects/leeds-dashboard/
Tue, 15 Jul 2014 15:36:01 +0000

Today (Tuesday 15th July) Cllr Yeadon and I are launching our very own ‘Leeds Dashboard’. The aim is to make Open Data understandable to all by creating something very visual that helps explain our city.

We’re launching Leeds Dashboard with a number of initial ‘widgets’ that have been created to demonstrate its capabilities – from daily footfall figures in the city centre, to parking fines, and planning application approval rates. As you’ll see, the Dashboard works on desktop, tablet and mobile devices and will also be appearing on big screens across the city.

Rather than creating a single one-time snapshot of the city, I wanted the dashboard to be a place where other people could create their own widgets and have them included. The vision is to create something that is truly “By the city, for the city” with developers creating widgets from data that is interesting to them. You may notice that each of the dashboard widgets has a “created by…” tag underneath it which links through to the creator’s design portfolio / Twitter account / personal website. This is so people are recognised for their work, something that was really important to me. There is also a section on the dashboard that provides design and development guidelines, along with icons that could be used to create the widgets.

As more and more Open Data is released on this site, richer analysis can be done as datasets can be mashed together to understand the city in new ways.

To back this up we are planning to host a range of Data Dive events where people and teams are invited to examine datasets under a different topic each month. Starting with “Health” in September, “Sport” in October, “Transport” in November, and “Energy” in January, we hope to be able to work with companies and organisations to release information as Open Data so others can use it to create widgets and insight like never before. I’ll be blogging about this in more detail and releasing event details over the coming weeks.

The long-term plan is for the Dashboard to provide a comprehensive live feed of what’s happening in the city and, eventually, people will be able to personalise their Dashboard with the widgets that are most relevant to their interests. Once we have enough hyper-local data, we could even create village dashboards. We will be continuing to work on the dashboard as we believe in it and want this to be a useful dashboard that is truly “By the city, for the city”.

So, into this brave new world we go, where we explain Open Data to everyone, not just Open Data geeks like me (and you!)

150 days down t’mill
http://staging.leedsdatamill.org/project-updates/150-working-days-at-leeds-data-mill/
Sun, 06 Jul 2014 08:03:24 +0000

I realised it’s been 150 working days since Leeds Data Mill was born, so I thought it’d be a good chance to reflect on the work that’s been happening, and highlight a few things that are going to be released over the next few weeks…

What happened at Health in Numbers?
http://staging.leedsdatamill.org/events/what-happened-at-health-in-numbers/
Fri, 04 Jul 2014 17:11:55 +0000

View the story "Health in Numbers" on Storify: //storify.com/LeedsDataMill/health-in-numbers
Can open data encourage civic participation?
http://staging.leedsdatamill.org/projects/open-data-and-civic-participation/
Tue, 24 Jun 2014 17:53:00 +0000

In June, independent game developers Wetgenes previewed the alpha version of a game they are developing for the Leeds Data Mill. The game, #LeedsArtCrawl, is one of several experiments we are carrying out at The Data Mill to encourage civic participation using the medium of open data.

Why create an art crawl?

The idea for #LeedsArtCrawl emerged after a series of discussions with Kriss and Shi, who are the people behind Wetgenes. I first met them at the Leeds Data Mill hack event held at Leeds City Museum in March. Over the course of two days, they conceived and developed a game using public funerals data. It was a remarkable effort in using open data to raise awareness and encourage discussion about a serious issue.

Wetgenes describe themselves as a feral games developer on their website. I was intrigued by this description and curious about their approach. So one afternoon we met in Leeds to talk about ideas and possible projects. At the time they were working on a very interesting gamification project for a major initiative in Mexico City and I was keen for them to apply a similar approach for a Leeds Data Mill experiment.

We decided to focus on public art as Leeds has a strong link with sculpture. Prominent artists Barbara Hepworth and Henry Moore trained at the Leeds School of Art. The city is part of the Yorkshire Sculpture Triangle and is considering a bid to become the European Capital of Culture in 2023.

There are 200+ public artworks distributed across the city. Some are very visible, others, not quite so. They also vary considerably in medium and style. So our goal is to create a comprehensive dataset about the public art offering across the city.

Can we crowdsource data?

The image at the beginning of this post hints at the strong social element of the experiment, which involves selfies and hashtags. These are just two examples of how we are constantly generating and sharing data based on our actions, behaviours and observations. Some might consider this to be a source of white noise, but there is a wealth of information hidden in this data stream. Take a look at this article on Fast Company about one of several projects across the world that are trying to crowdsource data through active public participation.

If we are able to crowdsource relevant and useful data, does that help us create a useful blueprint for civic participation and decision making? Furthermore, can such interventions help raise the profile of the city’s cultural offering, not just amongst its residents, but also nationally and internationally? This ties into tourism, public policy and economic development.

These are just some of the big questions we are keen to explore. But in keeping with the rapid prototyping approach we follow at The Data Mill, we will start small, build, measure and learn.

How to get involved

Turn on location on your phone (We need to know where the public art is.)

Take a selfie or photo in front of a piece of public art in Leeds

Tweet the photo using #leedsartcrawl as the hashtag so we can add you to the gallery

View other people’s photos

Abhay Adhikari (@gopaldass)
I am interested in the context & values that define our Digital Identities. I work with various to define the #FutureofWork. Blog for The Guardian.

Where should I live in Leeds?
http://staging.leedsdatamill.org/guest-blog/where-to-live-in-leeds/
Fri, 20 Jun 2014 08:52:58 +0000

I took part in a workshop last week with colleagues who respond to Freedom of Information (FoI) requests. Changes to the FoI Act now state that if a dataset is requested, we must provide the information in an open and machine-readable format to allow re-use. Not only that, we should also publish it on our website, as the chances are that if one person has shown interest in it, others may be interested too.

The challenge when planning this workshop was how to win over a potentially sceptical audience. They can almost certainly see the benefits of publishing data if it means a reduction in FoI requests, but what about just publishing it because it’s a good thing to do? I needed something tangible, an example which people could relate to.

Now I’m no techie and therefore my example was always going to be crude to say the least! I sourced some primary school location data from the Education Leeds website. I then found some average house price information from Zoopla and created a heat map using Google Fusion Tables. What did this show us? Well, it pinpointed the location of primary schools and showed where the most expensive areas to live were in relation to those schools. This, I might add, was the extent of my very crude example!

However, what if someone with much more technical knowledge than me could also source data such as parks, transport information, events, culture, crime statistics, places to eat and drink? I could go on. So, I expanded on my example. Imagine a ‘Where should I live in Leeds?’ app. Type in your budget, what your hobbies are, what you like to eat, your family circumstances, where you work etc., and an app like this could provide you with the best location in Leeds for you to live, in respect of what you want out of life and getting the best work-life balance. An app to improve your quality of life! In the council, we hold much of the data which would enable this type of app to become a reality.
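
To make the idea a little more concrete, here is a minimal sketch of how such an app might rank areas. Everything in it is an assumption made for illustration: the area names, prices and counts are invented placeholders, not council or Zoopla data, and the weighted-sum scoring is just one simple way of matching areas to preferences.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    avg_house_price: int  # pounds
    primary_schools: int
    parks: int
    places_to_eat: int


# Placeholder areas and figures, invented for illustration only.
AREAS = [
    Area("Area A", 150_000, 3, 1, 5),
    Area("Area B", 220_000, 2, 4, 12),
    Area("Area C", 320_000, 4, 3, 20),
]


def rank_areas(areas, budget, weights):
    """Return the affordable areas, best match first.

    `weights` maps an Area attribute name to how much the person cares
    about it; the score is a simple weighted sum of those attributes.
    """
    affordable = [a for a in areas if a.avg_house_price <= budget]

    def score(area):
        return sum(w * getattr(area, attr) for attr, w in weights.items())

    return sorted(affordable, key=score, reverse=True)


# A family with a budget of £250,000 that values schools and parks.
for area in rank_areas(AREAS, 250_000, {"primary_schools": 2, "parks": 1}):
    print(area.name)
```

A real app would need far richer data (transport, crime statistics, events and so on), but the shape stays the same: filter by what is affordable, then score what is left against what the person cares about.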

My second example was a ‘real-life’ one, where a member of the public had mapped child poverty data onto a map of Leeds across a number of years. It is this kind of analysis which the council itself could find really useful and indeed could help inform decision making in the future. If only more of this data were to be made available for people to analyse.

What we need to do now in the council is to see the bigger picture. Opening up our data is great for being open and transparent and for reducing officer time spent responding to FoI requests. It’s even better, however, when it’s published in an open and machine-readable format to allow people to re-use it, analyse it, and create really innovative solutions to everyday problems we all encounter in life. I need help in encouraging others across the council to realise the value of opening up our data and being transparent, so if there are any ‘real-life’ examples out there of what has been done with publicly sourced data that can help me be more persuasive, that’d be great.

—

Blog post by

Stephen Blackburn (@StevieBYorks)
Stephen is the Senior Information Governance Officer within Leeds City Council – leading on Open Data