blogstats stat

Category: 022 Web services

There are (at least) two big challenges that official statistics will face in the next few years and that may change its quasi-monopolistic position.

On the input side it’s Big Data

‘“Big Data” is a term used to describe massive information stores – generally measured in petabytes and exabytes – and also refers to the methods and technologies used to analyze these large data volumes. The core principles of Big Data (data mining, analytics) have been around for some time, but recent technology has enabled the collection and analysis of previously unimaginable data volumes at extremely high speeds.’ So says SAP, for example, giving some examples of how Big Data will change your life (big words, and they show how the big software and hardware players are beginning to occupy the field).

Official Statistics has already put this on the agenda! And so has the United Nations Statistics Division (UNSD) in its Friday Seminar on Emerging Issues, 22 February 2013.

The discussion has been launched and, as the HLG paper mentions: ‘To use Big data, statisticians are needed with a different mind-set and new skills. The processing of more and more data for official statistics requires statistically aware people with an analytical mind-set, an affinity for IT (e.g. programming skills) and a determination to extract valuable ‘knowledge’ from data. These so-called “data scientists” can be derived from various scientific disciplines.’

On the output side it’s (Linked) Open Data in combination with APIs

Open Data is not at all a new topic for Official Statistics. National Statistical Institutes were forerunners in openly providing data; organizations like the UN or Eurostat went this way as well.

Several Open Data initiatives (USA, UK, France, EU …) consist mostly of data catalogues, and are in that sense also public relations initiatives. A large part of the data so provided consists of statistical data already available, often, on the website of the National Statistical Institute concerned. The EU portal, for instance, offers 5716 datasets of statistical data from a total of 5893 (as of April 2013).

Further central questions are the licensing of data, as well as their availability in machine-readable formats.

Machine-readable statistical data, Application Programming Interfaces (APIs) to the data and especially Linked Open Data (LOD) (–> essentials, –> tutorial) open the way to creative applications and new models of presenting information.

A Europe-wide Linked Open Data (LOD2) project ‘was launched in September 2010 and will run for four years. It addresses exploitation of the web as a platform for data and information integration, and the use of semantic technologies to make government data more useable.’

Looking for third-party APPs

Data Providers are looking at applications or mashups made with their data with much interest, and they are even sponsoring competitions and hack days (like Apps4EU) to stimulate the reuse of open data, especially from the public sector.

The most popular app creator and statistical storyteller is Hans Rosling with Gapminder. Rosling himself is a pioneer in fighting for open data.

Open Data, Linked Open Data and APIs are changing the dissemination paradigm of statistical agencies. More people with new skills will do new things. Coding is becoming the new literacy, says, for example, Garrett Heath in his advice for his unborn daughter: ‘I was blown away that the buzz is not around mobile apps, but rather around using APIs. Ten years ago saw the creation of the social networking platforms. The past five years has been about accumulating the data. The next five years and beyond will be about interpreting that data. [My daughter will have access to] a boatload of interesting data sitting in accessible databases that is waiting to be exposed and interpreted with her [the programmer’s] creativity.’

Storytelling with data

Storytelling based on data is less and less the domain of statistical agencies. Storytelling can access multiple (new) resources and take on new forms. To satisfy the basic idea of an easily understandable and appealing presentation of statistical content, statistical institutions cannot avoid taking certain measures to improve their content and presentation. The “composer” must know how the music is to be played, that is as a quick, competent, qualitatively unique, reliable and indispensable data source.
But this presentation job can no longer be done on one’s own: cooperative partnerships are necessary and have already begun to some extent, both with partners outside statistical institutions and between such institutions. This discussion has been launched.

And this: Many small open data give big data insights

FORGET BIG DATA, SMALL DATA IS THE REAL REVOLUTION, says Rufus Pollock, co-Director of the Open Knowledge Foundation: ‘… the discussions around big data miss a much bigger and more important picture: the real opportunity is not big data, but small data. Not centralized “big iron”, but decentralized data wrangling. Not “one ring to rule them all” but “small pieces loosely joined”.’

Looking for important statistical indicators of European countries? Comparing these countries? Taking the application to your own website? Making a brochure of it?

All this is provided by a newly designed application on Statistics Switzerland’s portal.

Embed

And download all countries as a brochure

Open Data

The source data (from Eurostat and Swiss Statistics) are available as an Excel file, so the data are open, and the app made from these data is open, too. It lets users select indicators, embed the app and output all indicators as a PDF file. It may also be embedded into third-party websites, and other people can write their own apps with the data.

App made with a CMS

This Portrait-App is one of several apps of the same flavour. There are also portraits of the 26 Swiss Cantons, the biggest Cities and the (more than) 2500 Communes.

A Content Management System helps build these Portrait-Apps once the data are in the correct shape, and it does so in a very short time (hours).

An example of an API access to statistical data

The U.S. Census Bureau now offers some of its public data in machine-readable format. This is done via an Application Programming Interface (“API”).
Based on this API, an app has been developed that helps to query data from the 2010 Census:

No data without legal clarification. The Census Bureau handles it as follows:

‘Use
You may use the Census Bureau API to develop a service or services to search, display, analyze, retrieve, view and otherwise “get” information from Census Bureau data.

Attribution
All services, which utilize or access the API, should display the following notice prominently within the application: “This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.” You may use the Census Bureau name in order to identify the source of API content subject to these rules. You may not use the Census Bureau name, or the like to imply endorsement of any product, service, or entity, not-for-profit, commercial or otherwise.’
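To make the idea of API access a little more concrete, here is a minimal Python sketch of how a client might build a query and reshape the answer. The endpoint path, the variable code P001001 (total population) and the helper names are assumptions for illustration, not taken from the Census Bureau's documentation quoted above; check the current API documentation before relying on them. The sketch also assumes that the API answers with a JSON array of rows whose first row holds the column names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for the 2010 Census Summary File 1; the real path may differ.
BASE_URL = "https://api.census.gov/data/2010/dec/sf1"

# Notice required by the terms of service quoted above.
ATTRIBUTION = (
    "This product uses the Census Bureau Data API but is not endorsed "
    "or certified by the Census Bureau."
)


def build_query_url(variables, geography, key=None):
    """Build a query URL for the given variable codes and geography filter."""
    params = {"get": ",".join(variables), "for": geography}
    if key:
        # Optional API key (hypothetical parameter handling).
        params["key"] = key
    # Keep ',', ':' and '*' readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe=",:*")


def rows_to_records(table):
    """Turn a list of rows (first row = column names) into a list of dicts."""
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


def fetch_records(url):
    """Download the JSON answer and reshape it (needs network access)."""
    with urlopen(url) as response:
        return rows_to_records(json.load(response))
```

A call such as `fetch_records(build_query_url(["P001001", "NAME"], "state:*"))` would then, under these assumptions, return one record per state, and an app built on it would display `ATTRIBUTION` prominently.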

All the above, plus: Use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

★★★★★

All the above, plus: Link your data to other people’s data to provide context

‘Linked data is data in which real-world things are given addresses on the web (URIs), and data is published about them in machine-readable formats at those locations. Other datasets can then point to those things using their URIs, which means that people using the data can find out more about something without that information being copied into the original dataset.

This page lists the sectors for which we currently publish linked data and some additional resources that will help you to use it. Most sectors have one or more SPARQL endpoints, which enable you to perform searches across the data; you can access these interactively on this site.’
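What "performing searches" against a SPARQL endpoint can look like from a program is sketched below in Python. The endpoint URL and the query are hypothetical placeholders, not one of the endpoints mentioned in the quote. The sketch assumes only the W3C conventions: the query is sent as a `query` parameter, and the answer comes back in the SPARQL JSON results format (`head.vars` plus `results.bindings`).

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint; replace with a real SPARQL endpoint URL.
ENDPOINT = "https://example.org/sparql"

# Illustrative query: list ten things together with their labels.
QUERY = """
SELECT ?thing ?label WHERE {
  ?thing <http://www.w3.org/2000/01/rdf-schema#label> ?label .
} LIMIT 10
"""


def build_request(endpoint, query):
    """Prepare a GET request that asks for results in JSON."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(result):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    variables = result["head"]["vars"]
    return [
        # A variable may be unbound in a given row, so skip missing ones.
        {name: binding[name]["value"] for name in variables if name in binding}
        for binding in result["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return the flattened rows (needs network access)."""
    with urlopen(build_request(endpoint, query)) as response:
        return bindings_to_rows(json.load(response))
```

Because each `?thing` comes back as a URI, a client can follow it to find out more about that thing, which is exactly the linking idea described in the quote.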

Evaluating

What’s the effect of open data? Some newspapers (like the Guardian) make ample use of open data, but there is little sign of widespread activity or commitment, and few evaluation studies are to be seen. InformationWeek published an article about US open government and found that there is a lot to be done, as only small groups seem to take notice of this government activity. ‘The most difficult part of open government may be getting the public to participate. … the “if you build it, they will come” approach simply doesn’t work.’ (InformationWeek, Feb 21, 2011: Open Government Reality Check: Federal agencies are making progress on the Obama administration’s Open Government Directive, but there’s still a long way to go. Here’s our list of top priorities.)

For more and more online users the device of choice is a mobile device and for more and more of these users ‘Apps are the Web and the Web is Apps’.

Applications (apps) for mobile devices can be downloaded and installed in seconds. These apps focus on specific needs, and perhaps half a dozen apps meet the daily online demands of you and me.

With Apple’s planned App store for laptop and desktop computers these devices join this philosophy, too. So what about the future of Websurfing using classic browsers? And what about the future of complex Websites offering many levels of browser navigation and tons of pages delivering information?

The discussion (the fight) is under way and the users will decide.

For information suppliers like statistical agencies this issue is of huge importance.

How to ensure the mission for public information and democracy given such developments in the online world?

– with traditional websites?
– with (small) Apps (or Widgets) with specific, user-focused information portions?
– or both (for how long)?
– with integration into existing Apps or platforms where people are, like Facebook or Google?

Some interesting developments in the dissemination of statistics already give partial answers today.