if writing is a muscle, this is my gym

Tag Archives: datadotgc.ca

This week, I’m pleased to announce the beta launch of Emitter.ca – a website for locating, exploring and assessing pollution in your community.

Why Emitter?

A few weeks ago, Nik Garkusha, Microsoft’s Open Source Strategy Lead and an open data advocate, asked me: “are there any cool apps you could imagine developing using Canadian federal government open data?”

I felt that an application built on the National Pollutant Release Inventory (NPRI), one that allowed people and communities to more clearly see who is polluting in their communities, and how much, could be quite powerful. The raw data on the 220 chemicals that NPRI tracks isn’t, on its own, helpful or useful to most Canadians.

We agreed to do something and set for ourselves three goals:

Create a powerful demonstration of how Canadian Federal open data can be used

Develop an application that makes data accessible and engaging to everyday Canadians and provides communities with a tool to better understand their immediate region or city

We’d like to refine our methodology. It would be great to have a methodology that was more sensitive to chemical types, combinations and other factors… Indeed, I know Matt would love to work with ENGOs or academics who might be able to help provide us with better score cards that can help Canadians understand what the pollution near them means.

More features – I’d love to be able to include more datasets… like data on tumour rates, asthma rates or even employment rates.

I’d LOVE to do mobile, to be able to show pollution data in a mobile app and even using augmented reality.

Trends… once we get 2009 and/or earlier data we could begin to show trends in pollution rates by facility.

plus much, much more…

Build on our work

Finally, we have made everything we’ve done open: our methodology is transparent, and anyone can access the data we used through an API that we share. Also, you can learn more about Emitter and how it came to be by reading blog posts by the various developers involved.

Obviously the amazing group of people who made Emitter possible deserve an enormous thank you. I’d also like to thank the Open Lab at Microsoft Canada for contributing the resources that made this possible. We should also thank those who allowed us to build on their work, including Cory Horner’s Howdtheyvote.ca API, whose Electoral District boundaries we were able to use (why Elections Canada doesn’t offer this is beyond me and, frankly, is an embarrassment). Finally, it is important to acknowledge and thank the good people at Environment Canada who not only collected this data, but had the foresight and wisdom to make it open. I hope we’ll see more of this.

In Sum

Will Emitter change the world? It’s hard to imagine. But hopefully it is a powerful example of what can happen when governments make their data open: people will take that data and make it accessible in new and engaging ways.

I hope you’ll give it a spin and I look forward to sharing new features as they come out.

Update!

Since yesterday Emitter.ca has picked up some media. Here are some of the links so far…

At a time when only a handful of cities had open data portals and the words “open data” were not even being talked about in Ottawa, we saw the site as a way to change the conversation and demonstrate the opportunity in front of us. Our goals were to:

Be an innovative platform that demonstrates how government should share data.

Create an incentive for government to share more data by showing ministers, public servants and the public which ministries are sharing data, and which are not.

Provide a useful service to citizens interested in open data by bringing all the government data together in one place, making it easier to find.

In every way we have achieved this goal. Today the conversation about open data in Ottawa is very different. I’ve demoed datadotgc.ca to the CIOs of the federal government’s ministries and numerous other stakeholders, and an increasing number of people understand that, in many important ways, the policy infrastructure for doing open data already exists, since datadotgc.ca shows the government is already doing open data. More importantly, a growing number of people recognize it is the right thing to do.

In short, rather than just pointing to the 300 or so data sets that exist on federal government websites, members may now upload datasets to datadotgc.ca, where we can both host them and offer custom APIs. This is made possible because we have integrated Microsoft’s Azure cloud-based Open Government Data Initiative into the website.

So what does this mean? It means people can add government data sets, or even mash up government data sets with their own data to create interesting visualizations, apps or websites. Already some of our core users have started to experiment with this feature. London, Ontario’s transit data can be found on Datadotgc.ca, making it easier to build mobile apps, and a group of us have taken Environment Canada’s facility pollution data, uploaded it and are using the API to create an interesting app we’ll be launching shortly.

So we are excited. We still have work to do around documentation and tracking down some more federal data sets we know are out there, but we’ve gone live because nothing helps us develop like having users and people telling us what is, and isn’t, working.

But more importantly, we want to go live to show Canadians, and our governments, what is possible. Again, our goal remains the same: to push the government’s thinking about what is possible around open data by modeling what should be done. I believe we’ve already shifted the conversation – with luck, datadotgc.ca v2 will help shift it further and faster.

Finally, I can never thank our partners and volunteers enough for helping make this happen.

Richard Poynder has a wonderful (and detailed) post on his blog Open and Shut about the state of open data in the UK. Much of it covers arguments about why open data matters economically and democratically (the case I’ve been making as well). It is worthwhile reading for policy makers and engaged citizens.

There is however a much more important lesson buried in the article. It is in regard to the role of the Guardian newspaper.

As many of you know I’ve been advocating for open data at all levels of government, and in particular at the federal level. This is why I and others created datadotgc.ca: if the government won’t create an open data portal, we’ll create one for them. The goal, of course, was to show them that it already does open data, and that it could do a lot, lot more (there is a v2 of the site in the works that will soon offer more, and much cooler, functionality).

What is fascinating about Poynder’s article is the important role the Guardian has played in bringing open data to the UK. Consider this small excerpt from his post.

For The Guardian the release of COINS marks a high point in a crusade it began in March 2006, when it published an article called “Give us back our crown jewels” and launched the Free Our Data campaign. Much has happened since. “What would have been unbelievable a few years ago is now commonplace,” The Guardian boasted when reporting on the release of COINS.

Why did The Guardian start the Free Our Data campaign? Because it wanted to draw attention to the fact that governments and government agencies have been using taxpayers’ money to create vast databases containing highly valuable information, and yet have made very little of this information publicly available.

The lesson here is that a national newspaper in the UK played a key role in pressuring a system of government virtually identical to our own (now also governed by a conservative-led minority government) to release one of the most important datasets in its possession – the Combined Online Information System (COINS). This on top of postal codes and the equivalent of what we would find in Stats Canada’s databases.

All this leads me to ask one simple question. Where is the Globe and Mail? I’m not sure its editors have written a single piece calling for open data (am I wrong here?). Indeed, I’m not even sure the issue is on their radar. It certainly has done nothing close to launching a “national campaign.” They could do the Canadian economy, democracy and journalism a world of good. Open data can be championed by individual advocates such as myself, but having a large media player raising the issue, time and time again, brings to bear the type of pressure few individuals can muster.

All this to say, if the Globe ever gets interested, I’m here. Happy to help.

We didn’t build libraries for a literate citizenry. We built libraries to help citizens become literate. Today we build open data portals not because we have public policy literate citizens, we build them so that citizens may become literate in public policy.

Yesterday, in a brilliant article on The Guardian website, Charles Arthur argued that a global flood of government data is being opened up to the public (sadly, not in Canada) and that we are going to need an army of people to make it understandable.

I agree. We need a data-literate citizenry, not just a small elite of hackers and policy wonks. And the best way to cultivate that broad-based literacy is not to release data in small, measured quantities, but to flood us with it. To provide thousands of niches that will interest people in learning, playing and working with open data. But more than this, we also need to think about cultivating communities where citizens can exchange ideas, as well as involving educators to help provide support and increase people’s ability to move up the learning curve.

Interestingly, this is not new territory. We have a model for how to make this happen – one from which we can draw lessons or foresee problems. What model? Consider a process similar in scale and scope that happened just over a century ago: the library revolution.

In the late 19th and early 20th century, governments and philanthropists across the western world suddenly became obsessed with building libraries – lots of them. Everything from large ones like the New York Main Library to small ones like the thousands of tiny, one-room county libraries that dot the countryside. Big or small, these institutions quickly became treasured and important parts of any city or town. At the core of this project was the belief that literate citizens would be both more productive and more effective citizens.

But like open data, this project was not without controversy. It is worth noting that at the time some people argued libraries were dangerous: they could spread subversive ideas – especially about sexuality and politics – and giving citizens access to knowledge out of context would render them dangerous to themselves and society at large. Remember, ideas are a dangerous thing. And libraries are full of them.

…for a period of time, censorship was a key responsibility of the librarian, along with trying to persuade the public that reading was not frivolous or harmful… many were concerned that this money could have been used elsewhere to better serve people. Lord Rodenberry claimed that “reading would destroy independent thinking.” Librarians were also coming under attack because they could not prove that libraries were having any impact on reducing crime, improving happiness, or assisting economic growth, areas of keen importance during this period… (Geller, 1984)

Today when I talk to public servants, think tank leaders and others, most grasp the benefit of “open data” – of having the government share the data it collects. A few, however, talk about the problem of just handing data over to the public. Some question whether the activity is “frivolous or harmful.” They ask “what will people do with the data?” “They might misunderstand it” or “They might misuse it.” Ultimately they argue we can only release this data “in context”. Data, after all, is a dangerous thing. And governments produce a lot of it.

As in the 19th century, these arguments must not prevail. Indeed, we must do the exact opposite. Charges of “frivolousness” or a desire to ensure data is only released “in context” are code to obstruct or shape data portals to ensure that they only support what public institutions or politicians deem “acceptable”. Again, we need a flood of data, not only because it is good for democracy and government, but because it increases the likelihood of more people taking interest and becoming literate.

It is worth remembering: We didn’t build libraries for an already literate citizenry. We built libraries to help citizens become literate. Today we build open data portals not because we have a data or public policy literate citizenry, we build them so that citizens may become literate in data, visualization, coding and public policy.

This is why coders in cities like Vancouver and Ottawa come together for open data hackathons, to share ideas and skills on how to use and engage with open data.

But smart governments should not only rely on small groups of developers to make use of open data. Forward-looking governments – those that want an engaged citizenry, a 21st-century workforce and a creative, knowledge-based economy in their jurisdiction – will reach out to universities, colleges and schools and encourage them to get their students using, visualizing, writing about and generally engaging with open data. Not only to help others understand its significance, but to foster a sense of empowerment and sense of opportunity among a generation that could create the public policy hacks that will save lives, make public resources more efficient and effective and make communities more livable and fun. The recent paper published by the University of British Columbia students who used open data to analyze graffiti trends in Vancouver is a perfect early example of this phenomenon.

When we think of libraries, we often just think of a building with books. But 19th-century libraries mattered not only because they had books, but because they offered literacy programs, book clubs, and other resources to help citizens become literate and thus more engaged and productive. Open data catalogs need to learn the same lesson. While they won’t require the same centralized and costly approach as the 19th-century library, governments that help foster communities around open data, that encourage their school systems to use it as a basis for teaching, and that support their citizens’ efforts to write and suggest their own public policy ideas will, I suspect, benefit from happier and more engaged citizens, along with better services and stronger economies.

So what is your government/university/community doing to create its citizen army of open data analysts?

Yesterday I was part of a panel at the CIO Summit, a conference for the CIOs of the various ministries of the Canadian Government. There was lots more I would have liked to have shared with the group, so I’ve attached some links here as a follow-up for those in (and not in) attendance, to help flesh out some of my thoughts:

1. Doing mini-GCPEDIAcamps or WikiCamps

So what is a “camp”? Check out Wikipedia: “A term commonly used in the titles of technology-related unconferences, such as Foo Camp and BarCamp.” In short, it is an informal gathering of people who share a common interest and who gather to share best practices or talk about that interest.

There is interest in GCPEDIA across the public service, but many people aren’t sure how to use it (in both the technical and social sense). So let’s start holding small mini-conferences to help socialize how people can use GCPEDIA and help get them online. Find a champion, organize informally, do it at lunch, and ensure there are connected laptops or computers on hand. And do it more than once! Above all, a networked, peer-based platform requires a networked learning structure.

As I mentioned, a community of people have launched datadotgc.ca. If you are the CIO of a ministry that has structured data sets (e.g. CSV, Excel spreadsheets, KML, SHAPE files – things that users can download and play with, so not PDFs!) drop the URLs of their locations into an email or spreadsheet and send it to me! I would love to have your ministry well represented on the front-page graph on datadotgc.ca.

4. Let’s get more people involved in helping Government websites work (for citizens)

During the conference I offered to help organize some Government DesignCamps to help ensure that CLF 3 (or whatever the next iteration will be called) helps Canadians navigate government websites. There are people out there who would offer up some free advice – sometimes out of love, sometimes out of frustration – who, regardless of their motivation, could be deeply, deeply helpful. Canada has a rich and talented design community, including people like this – why not tap into it? More importantly, it is a model that has worked when done right. This situation is very similar to the genesis of the original TransitCamp in Toronto.

The fact is, if you aren’t even looking at open source solutions you are screening out part of your vendor ecosystem and failing in your fiduciary duty to examine all options for delivering value to taxpayers. Right now governments only seem to know how to pay LOTS of money for IT. You can’t afford to do that anymore. GCPEDIA is available to every government employee, has 15,000 users today and could easily scale to 300,000 (we know it can scale because Wikipedia is way, way bigger). All this for the cost of $60K in consulting fees and $1.5M in staff time. That is cheap. Disruptively cheap. Any alternative would have cost you $20M+ and, if scaled, I suspect $60M+.

For regular readers of my blog, I promise not to talk too much about datadotgc.ca here at eaves.ca. I am going to today because I’ve received a number of requests from people asking if and how they could help, so I wanted to lay out what is on my mind at the moment and, for those with time and capacity, how they could help.

The Context

Next Wednesday I’ll be doing a small presentation to all the CIOs of the federal public service. During that presentation I’d like to either go live to datadotgc.ca or at least show an up-to-date screenshot (if there is no internet). It would be great to have more data sets in the site at that time so I can a) impress upon this group how little machine-readable data there is in Canada versus other countries (especially the UK and US) and b) show them what an effective open data portal should look like.

So what are the datadotgc.ca priorities at this moment?

1. Get more data sets listed in datadotgc.ca

There is a list of machine-readable data sets known to exist in the federal government that has been posted here. For coders, the CKAN API is relatively straightforward to use. There is also an import script that allows one to bulk import data lists into datadotgc.ca, as well as instructions posted here in the datadotgc.ca google group.
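For readers who want to script this, here is a minimal sketch of what pushing a list of datasets into a CKAN instance might look like. The endpoint URL, field names and helper functions are my own assumptions based on a generic CKAN-style REST API; they are not the project’s actual import script, so check the documentation in the google group before using anything like this.

```python
import json
import re
from urllib import request

# Assumed CKAN-style REST endpoint for datadotgc.ca (hypothetical URL)
CKAN_API = "http://datadotgc.ca/api/rest/package"

def row_to_package(title, url, file_format):
    """Convert one row of the dataset list into a CKAN package dict."""
    # CKAN package names are lowercase, with hyphens instead of spaces
    name = re.sub(r"[^a-z0-9-]", "", title.lower().replace(" ", "-"))
    return {
        "name": name,
        "title": title,
        "resources": [{"url": url, "format": file_format}],
    }

def bulk_import(rows, api_key):
    """POST each (title, url, format) row to the CKAN API."""
    for title, url, file_format in rows:
        body = json.dumps(row_to_package(title, url, file_format)).encode("utf-8")
        req = request.Request(
            CKAN_API,
            data=body,
            headers={"Authorization": api_key,
                     "Content-Type": "application/json"},
        )
        request.urlopen(req)  # raises on HTTP errors
```

A spreadsheet of datasets then becomes a list of rows handed to `bulk_import(rows, my_api_key)`, one HTTP call per dataset.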

2. Better document how to bulk add data sets.

While the above documentation is good, I’d love to have some documentation and scripts that are specific to datadotgc.ca/ca.ckan.net. I’m hoping to recruit some help with this tonight at the Open Data hackathon, but if you are interested, please let me know.

3. Build better tools

One idea I had, and have shared with Steve T., is to develop a Jetpack add-on for Firefox that, when you are on a government page, scans for links to certain file types (SHAPE, XLS, etc…) and then lets you know if they are already in datadotgc.ca. If not, it would provide a form to “liberate the dataset” without forcing the user to leave the government website. This would make it easier for non-developers to add datasets to datadotgc.ca.
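The add-on itself would be written in JavaScript, but the core scanning step is simple enough to sketch here in Python. The list of file extensions and all names below are my own assumptions for illustration, not part of any existing datadotgc.ca tool.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# File extensions treated as machine-readable datasets (an assumed list)
DATA_EXTENSIONS = (".csv", ".xls", ".kml", ".shp", ".zip")

class DatasetLinkScanner(HTMLParser):
    """Collect hrefs on a page that point at data files."""

    def __init__(self):
        super().__init__()
        self.dataset_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href", "")
        # Compare against the URL path only, case-insensitively
        if urlparse(href).path.lower().endswith(DATA_EXTENSIONS):
            self.dataset_links.append(href)

def scan(html):
    """Return all dataset-like links found in an HTML page."""
    scanner = DatasetLinkScanner()
    scanner.feed(html)
    return scanner.dataset_links
```

The add-on would then check each collected link against datadotgc.ca and, for any not yet listed, pop up the “liberate the dataset” form.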

4. Locate machine readable data sets

Of course, we can only add to datadotgc.ca data sets that we know about, so if you know about a machine-readable dataset that could be liberated, please add it! If there are many and you don’t know how, ping me, or add it directly to the list in the datadotgc.ca google group.