An Open Conversation With an Open Data Czar

Gavin Starks, the founding chief executive of the Open Data Institute.

If data is “the new oil,” then the Open Data Institute’s 5,000-square-foot building in London’s trendy Shoreditch neighborhood is a huge oil field in the heart of the U.K. capital.

In the 18 months since it was established, the ODI (a hybrid non-profit data science school, research institute, startup accelerator, and business franchise) has been digging, mining, analyzing, and dealing in vast reserves of data. It has developed data training programs and courses for journalists, lawyers, executives, and practitioners, training over 300 people from 14 countries, including World Bank officials. It runs a corporate membership program with over 80 clients, who pay from £1,000 to £50,000 per year. It conducts R&D on open data sets for internal research, public policy advocacy, social issues, and paying customers who want to know where the inefficiencies are, and where the potential for innovation is, particularly around financial data. And it has built an international franchise in 13 countries, with each franchisee paying 10% of the income from its ODI-related work to HQ. Fifty-six percent of its London office’s income comes from commercial work, with revenue last year totaling £2 million. And in the spirit of the open data that it champions, all of its financial and corporate data is made public on its web site.

The ODI’s director, Gavin Starks, a successful tech veteran with a degree in astronomy, believes there is a strong business case for governments, organizations and businesses to make most of their data open, that is, to share it online to unlock its potential. Individuals, Mr. Starks believes, will eventually own their private data and license it back to companies in exchange for money, goods or services.

He recently sat down with The Wall Street Journal for an interview.

Edited excerpts:

WSJ: How big is big data?

Mr. Starks: We’re in the 25th anniversary of the web this year. How many of today’s biggest companies were around back then? Not many. The amount of data that exists is doubling roughly every two years. If you project that forward, what does that look like in 25 years’ time? If you try envisaging open data as part of the whole data ecosystem, together with closed data — it will be bigger than the web is today.

WSJ: Is it reasonable for a company to take open data — a free resource — turn it around and sell it back to the people who created that data in the first place?

Mr. Starks: This is already a business model, in operation for decades, even pre-web. How did Google make money? It provides eyeballs, and it’s able to do this because it aggregates all the free and open data from around the world so that people can find it. Companies take data from the public sector. This isn’t just about taking data that we used to charge for and making it free. It’s about saying, where do we stimulate open innovation in this world, and how do we let a million flowers bloom out of that rather than just a handful?

WSJ: If we produce all this big data, and it’s open for others to use, have we lost control over it?

Mr. Starks: There’s an interesting transition happening from centralized to decentralized, from massive centralized mainframes to the edge. The transition we’re going through now is from companies holding your data to you holding your data, and licensing it to the companies for a fee when they use it again. It’s going to take another decade, but that’s the curve we’re on. That then gives a sense of control back to the user.

WSJ: How open have you found governments and businesses to be with their data?

Mr. Starks: Governments, cities, and corporates have for years made decisions based on computer modeling. It’s a black box. Public services commission computer modeling agencies to delve into the data and work out what the impacts of different scenarios might be on a change in services. All of that process takes place in a black box. You can’t really do smart cities without open data.

WSJ: What’s the difference between open data and personal data, and is there not a danger that our personal data could be seen and used by others?

Mr. Starks: There is a difference between personal data and open data. Personal data is your individual health records, for example. The ODI would never classify that as open data, unless you had explicitly consented to that. Open data is data that can be used by anyone for any purpose for free.

WSJ: What’s in it for private companies to open up their data?

Mr. Starks: Why does Google have an open API for maps? What’s the benefit to Google? For many years Google offered their maps for free, and now that company and its maps are huge. There’s a good share-alike model, which is really powerful online. U.S.-based Red Hat Inc., a billion-dollar software company, took the open source operating system Linux and built an open source community around it. They now sell open source software. GitHub Inc., a U.S.-based website for software developers to share and collaborate on code, has millions of developers using its site. If you want to use their service for free you have to open source your software. As soon as you click a closed repository, they charge you. GitHub raised $100 million in their Series A on a $750 million valuation. There’s benefit for developers too: GitHub is one of the primary places any recruiter who knows what they’re doing in the software industry goes to hire people. Who would have thought that this would be one of the outcomes of setting up a code-sharing repository? The net benefit is that more people engage in developing products.

This article has been amended to reflect that Red Hat built an open source community around Linux, and that GitHub raised $100 million in their Series A on a $750 million valuation.