October 30, 2012 FCW.COM
DrillDown
A 21st-century approach
to democratizing data
"Unbelievable jobs numbers... These Chi-
cago guys will do anything," Jack Welch
tweeted.
Not surprisingly, the recent steep drop
in the unemployment rate has given rise
to conspiracy comments and discussions
about how the rate is derived. Maybe the
employment rate is inflated. Maybe it has
been understated for months. Maybe seasonal
adjustments play a part. Maybe.
Recent "democratizing data" con-
cepts hold great promise for improv-
ing accountability and even increasing
value from the billions of dollars spent on
thousands of government data-collection
programs. Yet when doubts dominate
market-moving, election-shifting data, it
is clear that America needs government
to change more than how it distributes
data. Should government collect the same
data, and in the same way, that it did in the
last century? More important, should
government's central role in collecting and
disseminating data be changed?
Every day an organization near Bos-
ton sends its agents out to collect the
prices of thousands of items sold by
hundreds of retailers and manufactur-
ers around the world. The agents are
dozens of servers using software to
scrape prices from websites. In near-
real time, the price data is collected,
stored, analyzed and sent to some of
the largest investment and financial
organizations on the planet, including
central banks.
This is the Billion Prices Project run
by two economics professors at the Mas-
sachusetts Institute of Technology. With
a 21st-century approach, two people can
collect and analyze the costs of goods
and services purchased in economies
all over the world using price data read-
ily available online from thousands of
retailers. They mimic what consumers
do to find prices via Amazon, eBay and
Priceline. The Billion Prices Project does
not sample. It uses computer strength to
generate a daily census of the price of all
goods and services. It routinely predicts
price movements three months before
the government Consumer Price Index
(CPI) announces the same.
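The scrape-store-aggregate pipeline described above can be sketched in miniature. Everything in this illustration --- the page snippets, the price markup, the simple average-of-price-relatives index --- is invented for clarity; it is not the Billion Prices Project's actual code or methodology.

```python
import re
from statistics import mean

# Stand-ins for product pages fetched from retailer websites.
# (Invented data; a real collector would download thousands of pages.)
PAGES = {
    "retailer-a/milk":  '<span class="price">$3.49</span>',
    "retailer-a/bread": '<span class="price">$2.99</span>',
    "retailer-b/milk":  '<span class="price">$3.59</span>',
}

# Hypothetical price markup; real sites vary and need per-site rules.
PRICE_RE = re.compile(r'class="price">\$([0-9]+\.[0-9]{2})<')

def scrape_prices(pages):
    """Extract the posted price from each product page."""
    return {url: float(PRICE_RE.search(html).group(1))
            for url, html in pages.items()}

def daily_index(prices, base_prices):
    """Average of price relatives versus a base day, base = 100."""
    return 100 * mean(prices[u] / base_prices[u] for u in base_prices)

prices = scrape_prices(PAGES)
print(daily_index(prices, prices))  # base day itself: 100.0
```

Run daily, the same scrape produces a fresh set of prices whose index value can be compared against the base day --- a census of posted prices rather than a sampled basket.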
Beginning in the early 20th century, the
Bureau of Labor Statistics responded to
the need to determine reasonable cost-
of-living adjustments to workers' wages
by publishing a price index tied to goods
and services in multiple regions. Over
time, government data collections grew
through the best methods available in the
20th century --- surveys and sampling
--- and built huge computer databases
on a scale only the government could
accomplish and afford. Even today, the
CPI is based on physically collecting the
prices --- by taking notes in stores ---
services. The manual approach means
the data is not available until weeks after
consumers are already feeling the impact.
The federal government s role as chief
data provider has resulted in approxi-
mately 75 agencies that collect data using
more than 6,000 surveys and regulatory
filings. Those data-collection activities
annually generate more than 400,000
sets of statistics that are often duplica-
tive, sometimes conflicting and generally
published months after collection. The
federal government is still investing in
being the trusted monopoly provider of
statistical data by developing a single por-
tal --- Data.gov --- to disseminate the data
it collects.
The Internet has become a ubiquitous kiosk for posting information. The government's role
in collecting and disseminating data should change accordingly.
BY CHRISTOPHER J. LYONS AND MARK A. FORMAN
"
"
Non-government entities are
increasingly lling the information
quality gap, generating the timely,
trusted data and statistics that
businesses and policy-makers use ---
and pay for.