States and the International System

The other international data pages on this site may also contain relevant data.

COW Interstate System:
The latest official list of all members of the
Correlates of War Project's interstate system, including all major powers, the
composition of the system, and all dyads in the system.

Spatial-Temporal Domain: Entire world, 1816-2004

Variables Included: States.csv: COW state abbreviation, number,
name, and entry and exit years of statehood; Majors.csv: entry and exit
dates of major power status; System.csv: annual composition of the
interstate system (one entry per nation-year); Dyads.csv: annual dyadic
composition of the interstate system (one entry per nondirectional dyad-year).
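
The relationship between the nation-year and nondirectional dyad-year layouts described above can be sketched as follows. This is only an illustration, not the COW Project's own procedure: the function name `dyad_years`, the tuple layout, and the sample records are assumptions made for the example, and the actual column names in System.csv and Dyads.csv are not given here, so the sketch works on in-memory records rather than reading the files.

```python
from collections import defaultdict
from itertools import combinations


def dyad_years(nation_years):
    """Build nondirectional dyad-years from (state_number, year) records.

    Hypothetical helper: each pair of states that are both system members
    in a given year yields exactly one row, with the lower state number first.
    """
    members = defaultdict(set)
    for state, year in nation_years:
        members[year].add(state)

    rows = []
    for year in sorted(members):
        # combinations() over a sorted list gives each unordered pair once
        for a, b in combinations(sorted(members[year]), 2):
            rows.append((a, b, year))
    return rows


# Illustrative nation-year records (state number, year); not real data rows.
sample = [(2, 1816), (200, 1816), (220, 1816), (2, 1817), (200, 1817)]
rows = dyad_years(sample)
for row in rows:
    print(row)
```

Because the dyads are nondirectional, the pair (2, 200) appears once per year rather than once in each direction; a directed layout would list both orderings.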

Gleditsch and Ward Interstate System: An alternative list of states in the system, described in a 1999 International Interactions article. It is based on somewhat different coding rules from the COW system and gives somewhat different dates for many states. The authors provide a list of qualifying states, a list of microstates, and documentation on each included case.

Spatial-Temporal Domain: Entire world, 1816-2002

Variables Included: State number, name, and abbreviation; dates of system
entry and exit.

EUGene Software: Expected Utility Generation and data management program, written by D. Scott Bennett and
Allan Stam; this is a Windows-only program that generates data sets for the study of
international conflict, with a variety of commonly used variables. More data sets and
variables are added frequently, so this list may not be complete -- please check the official
EUGene web site for the latest list and any updates.

Spatial-Temporal Domain: EUGene can generate data sets with the directed dyad-year,
nondirected dyad-year, country-year, or directed dispute-dyad as the unit of analysis, 1816-1993
(or any subset of this time period; dyads can also be limited to contiguous, major power, or
"politically relevant" dyads)

Russett/Oneal Triangulating Peace data: Replication data for Bruce Russett and John Oneal's 2001 book. These data sets have been widely used by
other scholars as a starting point for their own analyses. The above link is to the Stata version of the data; an ASCII version is also provided.

Spatial-Temporal Domain: Varies by data set; generally includes the
entire international system for the 1950-1989 period.

Variables Included: Varies by data set; generally includes data on
international system membership, trade, alliances, international organization
membership, civilizational identity, and various control variables.

ICOW Historical State Names data: From the
Issue Correlates of War (ICOW) Project. This file is a PDF document
listing alternative names for nation-states in the COW interstate system,
including traditional names, alternate spellings of common names, foreign spellings,
and some colonial-era names. This data set is important for scholars coding
historical data, because many older source materials use country names that are no longer
used or understood, so coders might otherwise overlook or miscode data.

Spatial-Temporal Domain: entire world, 1816-2001 (approximately; this is
intended to be a supplement to the latest version of the COW interstate system, which is
available above)

Variables Included: COW country code and name, alternative names for each state

ICOW Colonial History data: From the ICOW Project.
This file lists each nation-state's colonial rulers as well as the dates and processes of independence. The data are provided as both an Excel spreadsheet and a comma-delimited CSV file, and the accompanying documentation is an RTF word processing document; the folder containing these files is compressed in ZIP format.

Spatial-Temporal Domain: entire world, 1816-2004 (approximately; this is
intended to be a supplement to the latest version of the COW interstate system, which is
available above)

Variables Included: COW country code and name, dates of interstate system membership, name of colonial ruler, date and process of independence (including comparable data from the COW Territorial Change and Polity 2 data sets as well as ICOW).

List of Stamp-Issuing Entities: From Linn's Stamp Magazine; technically not an academic data resource,
but nonetheless an informative analysis of the numerous entities -- including states,
dependencies, quasi-states, and others -- that have issued stamps at various times
in the last two centuries. Most entities' entries have useful information about
histories, colonial rule, dates of independence, and similar topics. Linn's also
provides a country name cross-index to help trace names that have changed over time.

Spatial-Temporal Domain: unclear (apparently covering the time
since stamps were first issued)

Variables Included: Military capabilities: military personnel and military
expenditures; industrial capabilities: iron/steel production and energy consumption;
demographic capabilities: total population and urban population.

Variables Included: Varies by data set; the data includes six different units of analysis: the alliance, the alliance phase, the alliance member, the state-year, the dyad-year, and the directed dyad-year.

Variables Included: Alliance dates, members, and type; additional variables
indicate whether the alliance began before 1816 or continued after 2000.

Doug Gibler's 1648-1815 Alliance Data: An extension of the basic COW alliance data set to the centuries before the Congress
of Vienna, introduced in Gibler's 1999 International Interactions article "An Extension of
the Correlates of War Formal Alliance Data Set." Because the standard COW
interstate system list only covers the period since 1816, Gibler has also created a 1648-1815
system list, which users of this data set will need. Note that the download for this data set is no longer available following
Gibler's move from Kentucky to Alabama; interested users should contact him.

Doug Gibler's Territorial Settlement Alliance Data: From Gibler's 1996 Conflict Management and Peace Science article "Alliances that Never
Balance: The Territorial Settlement Treaty." This is an Excel spreadsheet listing
all alliances from the larger COW alliance data set that contained territorial settlement
provisions, with additional details about each one. Note that the download for this data set is no longer available following
Gibler's move from Kentucky to Alabama; interested users should contact him.

Spatial-Temporal Domain: All international alliances with territorial
settlement provisions that were begun 1816-1977

Spatial-Temporal Domain: All members of COW interstate system, 1816-2005

Variables Included: Country code and name, dates of system membership, dates of acceptance of each global or regional organization, total number of treaty obligations for pacific dispute settlement and for territorial integrity in each year; available as raw treaty-level data as well as compiled into state-year and dyad-year-level data sets for ease of merging with other data.

Russett et al. Data Sets: A variety of data sets employed in publications by Bruce Russett and colleagues -- including
John Oneal, Michaelene Cox, David Davis, and Harry Bliss -- that examine international
conflict in the post-World War II period. These data sets have been widely used by
other scholars as a starting point for their own analyses, and include data on both
alliance and international organization membership since 1950 that is not currently
available elsewhere.

Spatial-Temporal Domain: Varies by data set; generally includes the
entire international system for the 1950-1989 period.

Variables Included: Varies by data set; generally includes data on
international system membership, trade, alliances, international organization
membership, civilizational identity, and various control variables.

Minimum Distance Data and Distance between Capital Cities Data: From
Kristian Gleditsch and Mike Ward, as introduced in their 2001 Journal of Peace Research article. Two different measures of distance between each pair of countries in the international system, available in several different formats and for both the Gleditsch-Ward and COW international systems.
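
To give a sense of what a capital-to-capital distance measure involves, here is a minimal sketch that computes the great-circle distance between two points from their latitude and longitude. The haversine formula, the mean Earth radius of 6371 km, the function name `great_circle_km`, and the approximate coordinates are all assumptions chosen for illustration; the entry above does not say how the distributed distances were calculated, so this should not be read as the authors' method.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Approximate coordinates, for illustration only.
washington = (38.90, -77.04)
london = (51.51, -0.13)

d = great_circle_km(*washington, *london)
print(round(d), "km")
```

A minimum-distance measure is a different quantity: it depends on the closest points of the two countries' territories rather than on a single point per country, so it cannot be obtained from capital coordinates alone.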

Shatterbelt data: State-level data on the shatterbelt
status of geographic regions, as used in Hensel and Diehl's 1994
Political Geography article -- now extended
through 1992 by David Reilly. This data set is compressed in .ZIP format and is provided in .CSV comma-delimited format,
along with a brief codebook in PDF format.

Spatial-Temporal Domain: recent snapshot (not available in any time series format)

Variables Included: infectious disease (affected areas, population in affected areas,
area and population affected by malaria);
physical geography (latitude and longitude of each country's centroid, mean elevation, mean distance to nearest ice-free coastline, mean distance to nearest ice-free coastline or sea-navigable river, distance from a country's centroid to nearest coastline, distance from a country's centroid to nearest coastline or sea-navigable river, total population, percent of population within 100km of the coastline, percent of the population within 100km of the nearest coastline or sea-navigable river, percent of land area within 100km of the coastline, percent of land area within 100km of the nearest coastline or sea-navigable river, percent of population in the geographic tropics, percent land area in the geographic tropics, and the typical population density an average person experiences);
climate zones;
agriculture (FAO soil suitability (two types), percent of Matthews' cultivated land in each Köppen-Geiger climate zone, percent of Matthews' cultivated land -- using a revised classification scheme -- in each Köppen-Geiger climate zone, percent land area in each Köppen-Geiger climate zone weighted by Matthews' cultivated land, percent land area in each Köppen-Geiger climate zone weighted by a revised Matthews' cultivated land classification).

Getty Thesaurus of Geographic Names: A searchable interface allowing you to look up any geographic feature; the database
will return details such as the latitude and longitude of the feature.

Social Science Data Collections

These resources offer access to a wide variety of social science data sets,
some of which are included in the other data pages on this site but many of which
are not (usually because they fall outside the general categories used here).