Revision as of 17:44, 6 October 2017

The Maddison Project, also known as the Maddison Historical Statistics Project, is a project to collate historical economic statistics, such as GDP, GDP per capita, and labor productivity.[1][2][3] It was launched in March 2010 to continue the work of the late economic historian Angus Maddison. The project is under the Groningen Growth and Development Centre at the University of Groningen,[2] which also hosts the Penn World Table, another economic statistics project.[4]

This page describes the data and methods produced both by the Maddison Project itself and by Angus Maddison before his death (the Maddison Project is a continuation of his work).

Summary

March 2010 for the explicit Maddison Project.[1]

1960s for the original work by Angus Maddison that was the genesis of the project.[5]:3

Data versioning

Only one update released as the Maddison Project, published January 2013 with data till 2010.[5][6]

Multiple versions by Angus Maddison, the last of which was published in February/March 2010.[7]

Focus

Historical: identify general ballparks and trends in living standards and economic growth over long time periods.

Provide better insight into the timeline of the Great Divergence between Western Europe and other regions that were historically similarly situated, such as China and India.

Data description

Data dimensions and metrics

The data presented in the Maddison Project database is a partial function where:

The inputs (the dimensions) are country and year.

The metrics include:

Population: Included in

Real GDP

Real GDP per capita, expressed in 1990 international Geary–Khamis dollars. For simplicity, we will refer to this as GDP per capita.
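The partial-function view can be sketched in code. The snippet below is only an illustration: the names <code>Metrics</code>, <code>DATA</code> and <code>lookup</code> are hypothetical, and the two entries are placeholder numbers, not actual Maddison estimates.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Metrics(NamedTuple):
    """Metrics recorded for one (country, year) pair; any of them may be missing."""
    population: Optional[float]
    gdp: Optional[float]              # real GDP
    gdp_per_capita: Optional[float]   # 1990 international Geary-Khamis dollars


# Placeholder entries for illustration only. These are NOT real Maddison figures.
DATA: Dict[Tuple[str, int], Metrics] = {
    ("Country A", 1820): Metrics(population=None, gdp=None, gdp_per_capita=1000.0),
    ("Country A", 1821): Metrics(population=None, gdp=None, gdp_per_capita=1010.0),
}


def lookup(country: str, year: int) -> Optional[Metrics]:
    """Return the metrics for (country, year), or None where the function is undefined."""
    return DATA.get((country, year))
```

Returning <code>None</code> for a missing pair, rather than raising an error, mirrors the fact that the database simply has no entry for many combinations of country and year.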

Year dimension

While calendar years are the finest granularity at which data is presented, not all calendar years have data. Here is a description of how the granularity changes over time. Note that we count a year as present if there is data for at least one country for that year.[6][9]

{| class="wikitable" border="1"
! Year range !! Data granularity for Maddison 2010 (data till 2008) !! Data granularity for Maddison 2013 (data till 2010)
|-
| 2009 to 2010 || Not present || Every year
|-
| 1820 to 2008 || Every year || Every year
|-
| 1800 to 1819 || Not present || Every year
|-
| 1700 to 1799 || 1700 (only one year) || 1700, 1725, 1750, 1775
|-
| 1400 to 1699 || 1400, 1500, 1600 (once every 100 years) || 1400, 1450, 1500, 1550, 1600, 1650 (once every 50 years)
|-
| Before 1400 || 1, 1000 || 1, 730, 1000, 1150, 1280, 1300, 1348
|}
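The year granularity listed above can be written out as a set of years per release, which makes it easy to check whether a given year appears at all. This is a sketch that transcribes the listing; the function name <code>years_present</code> is hypothetical, and the sets say nothing about which countries have data in those years.

```python
def years_present(release: int) -> set:
    """Years with data for at least one country, per the granularity listing.

    `release` is 2010 (Maddison 2010, data till 2008) or
    2013 (Maddison Project 2013, data till 2010).
    """
    if release == 2010:
        years = {1, 1000}                        # before 1400
        years.update({1400, 1500, 1600})         # once every 100 years
        years.add(1700)                          # only one year in the 1700s
        years.update(range(1820, 2009))          # every year, 1820 to 2008
    elif release == 2013:
        years = {1, 730, 1000, 1150, 1280, 1300, 1348}   # before 1400
        years.update(range(1400, 1700, 50))      # once every 50 years
        years.update(range(1700, 1800, 25))      # 1700, 1725, 1750, 1775
        years.update(range(1800, 2011))          # every year, 1800 to 2010
    else:
        raise ValueError("release must be 2010 or 2013")
    return years
```

For example, <code>sorted(years_present(2013) - years_present(2010))</code> lists the years that are present only in the 2013 release.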

Country dimension

The country dimension includes most modern countries, but also includes historical countries (such as the former USSR) and regions within countries (such as centre-north Italy) for which it is easier to get historical data than their modern country equivalents. The working paper says:[5]

A related issue is that historical estimates often refer to different territorial entities than the countries within the borders of 1990, the basic unit of account used in the Maddison framework. He made many corrections for (minor) changes in borders (an overview will be provided in future work). However, moving back in time sometimes means that we have only estimates for Northern Italy (instead of Italy as a whole), for Holland (Netherlands) or for the Cape Colony (South Africa). When those smaller regions represented less than two-third of the population and/or the GDP of the modern country (within current borders), we have presented the estimates in italics to warn users.

In addition, data is also presented on aggregate regions (such as Western Europe) and the whole world. Data on aggregate regions and former countries is presented in bold.
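The formatting conventions for the country dimension can be summarized as a small decision rule. The sketch below rests on assumptions that the working paper does not spell out: "and/or" is read as "either share falls below two-thirds", bold is given precedence over italics, and the function name and parameters are hypothetical.

```python
from typing import Optional

TWO_THIRDS = 2 / 3


def cell_style(is_aggregate_or_former: bool,
               population_share: Optional[float] = None,
               gdp_share: Optional[float] = None) -> str:
    """Return 'bold', 'italic' or 'plain' for an estimate.

    The shares are the fraction (0 to 1) of the modern country's population
    and GDP covered by the region the estimate actually refers to; None means
    the share is unknown or not applicable.
    """
    if is_aggregate_or_former:
        return "bold"   # aggregate regions and former countries
    known_shares = [s for s in (population_share, gdp_share) if s is not None]
    # Assumption: italicize if either known share is below two-thirds.
    if any(share < TWO_THIRDS for share in known_shares):
        return "italic"
    return "plain"
```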

Other information

As mentioned in connection with the country dimension, italics and bold are used for some cells.

Some estimates carry notes, which are visible by hovering over the cell in spreadsheet software. A triangle at the top right of a cell indicates that the cell has notes.

Although the estimates should be color-coded (see #Accuracy and precision), this color-coding does not seem to show up in Google Sheets or Mac's Numbers software (it might show up only in Excel).

Caveats

Completeness

Data is not present for every combination of country and year. This could be because of the absence of reliable sources that could be used to construct the data.

Data on former countries continues to be calculated for years after that country ceases to exist as a geopolitical entity.

Accuracy and precision

Data is fairly inaccurate and imprecise, with the accuracy and precision getting worse the farther back in time we go, and also varying significantly across geographical regions. However, the database does not provide error bars or any other explicit quantification of uncertainty.

The working paper uses Steve Broadberry's classification of data sources, listing four broad types of data sources in decreasing order of reliability, along with the color coding that should show up in the database.[5]:4 Unfortunately, the color coding does not seem to show up in Mac's Numbers or in Google Sheets.[6]

{| class="wikitable" border="1"
! Rank !! Type of estimate !! Color coding in database
|-
| 1 || official estimates of GDP, made by national statistical offices or by international agencies (UN, for example) || black
|-
| 2 || historical estimates based on the same methods and broad range of data || blue
|-
| 3 || historical estimates based on indirect proxy variables || orange
|-
| 4 || guesstimates || red
|}
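Since the color coding may not be visible in every spreadsheet program, anyone who records the colors by hand may find it convenient to map them back to the reliability ranks. The following is a minimal sketch of such a mapping; the names <code>SourceReliability</code>, <code>COLOR_TO_RANK</code> and <code>least_reliable</code> are hypothetical and not part of the database.

```python
from enum import IntEnum
from typing import Iterable


class SourceReliability(IntEnum):
    """Ranks from the classification; a lower number means a more reliable source."""
    OFFICIAL = 1                 # official estimates of GDP
    HISTORICAL_SAME_METHODS = 2  # historical estimates, same methods and broad data
    INDIRECT_PROXY = 3           # historical estimates from indirect proxy variables
    GUESSTIMATE = 4              # guesstimates


COLOR_TO_RANK = {
    "black": SourceReliability.OFFICIAL,
    "blue": SourceReliability.HISTORICAL_SAME_METHODS,
    "orange": SourceReliability.INDIRECT_PROXY,
    "red": SourceReliability.GUESSTIMATE,
}


def least_reliable(colors: Iterable[str]) -> SourceReliability:
    """Return the weakest source type among the given cell colors."""
    return max(COLOR_TO_RANK[color] for color in colors)
```

Taking the weakest rank is one conservative way to label a series that mixes source types; other summaries are equally defensible.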

Design decisions that might lead to systematic biases

Maybe something about the purchasing power parity marking China as too expensive?[10]