Digital India: What India’s Open Data Program could learn from USAFacts.org

The importance of data is growing not just in the private sector but also in government. Data helps citizens, and governments too can benefit from it by making better-informed decisions. Let us touch on what three sites do: Data.gov [US Govt], USAFacts.org [US non-govt owned] and Data.gov.in [Indian Govt].

Any open data platform from a government must allow its departments to publish data and make that data available for others to analyse. Such platforms don't host the data themselves but rather aggregate metadata about open data resources in one centralized location.

Open Data Day – a day devoted to encouraging governments to make public data freely available in machine-readable formats under open licenses

Data.Gov – Open Data from the US Government

It must be noted that the US government already has a superb open data site, Data.gov. The site offers data, tools, and resources to conduct research, develop web and mobile applications, and design data visualizations.

Under the terms of the 2013 Federal Open Data Policy, newly generated government data is required to be made available in open, machine-readable formats while continuing to ensure privacy and security.

Data.gov is built on WordPress and CKAN (world's largest open source data portal platform). The Data.gov source code is available on GitHub.
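Because CKAN portals expose a JSON "Action API", a catalog built on it can in principle be searched from a script. The following is a minimal sketch, assuming the Data.gov catalog serves the standard CKAN `package_search` action at the base URL shown. The base URL, the helper names `build_search_url` and `summarize`, and the sample dataset are illustrative assumptions, not something taken from Data.gov's documentation.

```python
import json
from urllib.parse import urlencode

# Assumption: the catalog exposes the standard CKAN Action API at this base URL.
CATALOG_BASE = "https://catalog.data.gov/api/3/action"


def build_search_url(query, rows=5, base=CATALOG_BASE):
    """Return a CKAN package_search URL for a free-text query."""
    return f"{base}/package_search?{urlencode({'q': query, 'rows': rows})}"


def summarize(response):
    """Return (title, sorted resource formats) for each dataset in a
    package_search response that has already been parsed from JSON."""
    if not response.get("success"):
        raise ValueError("CKAN reported a failed request")
    summary = []
    for dataset in response["result"]["results"]:
        formats = sorted(
            {r["format"].upper() for r in dataset.get("resources", []) if r.get("format")}
        )
        summary.append((dataset["title"], formats))
    return summary


# Hand-written sample in the shape CKAN returns; the dataset itself is made up.
SAMPLE = json.loads("""
{
  "success": true,
  "result": {
    "count": 1,
    "results": [
      {
        "title": "Example air quality readings",
        "resources": [
          {"format": "csv", "url": "https://example.org/air.csv"},
          {"format": "JSON", "url": "https://example.org/air.json"}
        ]
      }
    ]
  }
}
""")

print(build_search_url("air quality"))
print(summarize(SAMPLE))
```

To run this against a live portal, the URL from `build_search_url` could be fetched with `urllib.request.urlopen` and the body passed through `json.load` before calling `summarize`. Note that the search returns metadata and links to resources, not the data itself, which matches the aggregation role described earlier.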

USAFacts.org – the Balance Sheet of US Government

Former Microsoft CEO Steve Ballmer launched USAFacts, which has detailed statistical reports on local, state and federal governments. He rightly said that in the era of fake news the numbers speak for themselves and show how the country (US) is being run. A podcast with Steve Ballmer on the subject is also available.

Before we get into the current state of open data in India, it is useful to get familiar with the highlights of USAFacts:

USAFacts uses only government data as its source. Hence, many reports are based on data released in 2014 or 2015 (so let us stop complaining about the Govt of India being slow!).

A report on the government's performance (operational results, risk factors, analysis of financials) is released. The report follows the format of a public company's annual 10-K report to the Securities and Exchange Commission (SEC). Though USAFacts follows a corporate reporting structure, it does not propose that the government should be run as a business.

It aggregates government statistics by combining federal, state, and local statistics to show the full picture of government. The data from each of these sources are in different formats and are compiled into a single database.

Figures for the same measure from different departments can contradict each other. USAFacts decides which one to use.

The methodology for revenue and expenses is well explained on their site. It addresses double counting and grants from state/federal governments.
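To see why double counting matters when federal, state, and local figures are combined, here is a minimal sketch. It is an illustration of the general idea only, not USAFacts' actual methodology; the class, the function name, and all the numbers are hypothetical. The assumption is that each level reports its total spending including grants it passes to other levels, and that the receiving level then reports spending that same money.

```python
from dataclasses import dataclass


@dataclass
class LevelOfGovernment:
    name: str
    total_spending: float          # everything this level paid out, grants included
    grants_to_other_levels: float  # part of total_spending handed to other governments


def combined_spending(levels):
    """Sum spending across levels, counting each transferred amount only once.

    A grant is removed from the giver's total because it shows up again
    in the total of the level that receives and spends it.
    """
    return sum(l.total_spending - l.grants_to_other_levels for l in levels)


# Hypothetical figures: the federal level spends 100, of which 20 is a grant
# to the state; the state spends 50, which already includes that 20.
levels = [
    LevelOfGovernment("federal", total_spending=100.0, grants_to_other_levels=20.0),
    LevelOfGovernment("state", total_spending=50.0, grants_to_other_levels=0.0),
]

naive_total = sum(l.total_spending for l in levels)  # counts the grant twice
print(naive_total, combined_spending(levels))
```

In this made-up case the naive sum is 150 while the consolidated figure is 130, the difference being the 20 that would otherwise be counted at both levels.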

Many may not be aware that, as part of the Digital India program, the Govt of India has developed the Open Government Data (OGD) Platform India, Data.gov.in, built on Drupal. This is a joint initiative of the Government of India and the US Government. A good number of countries today have open data sites.

There are many good reports and visualizations available on Data.gov.in, but I wish the raw data were made available to data scientists so they could build interesting models and reports (I still think it is available on Data.gov.in, but I have not been able to find it). We need access to datasets the way Data.gov provides it in the US.

SocialCops: On a mission to confront the world’s most critical problems through data intelligence.

Conclusion

Data.gov.in and Data.gov are similar: both are the official providers of government data. India needs a third party to build and analyze reports along the lines of USAFacts.org. There are many data enthusiasts in the country these days who could pool their talent, passion and energy to work on this interesting project.

The Govt of India and the state governments need to provide a lot more data for public consumption. Doing so increases the accountability of the various govt departments, which is in the interest of the country.

Only two states in India have an Open Data Policy: Sikkim & Telangana.

India should champion this first in English, then make all the reports available in all Indian languages (not just Hindi). The data is structured, and creating reports from structured data in multiple Indian languages is very much achievable.