1. What types of Chinese overseas investments do you track in your dataset?

2. Which geographic areas and time periods does your China dataset cover?

3. What information do you collect on Chinese official finance projects?

4. How do you collect your official finance data? How do you ensure the data is reliable?

5. Is the data available in the China dashboard the same as what is included in the static data?

6. How should I cite your China dataset and dashboard? Are there any licensing restrictions?

1. What types of Chinese overseas investments do you track in your dataset?

The Global Chinese Official Finance Dataset, Version 1.0 covers the known universe of projects officially financed by China in five major regions of the world from 2000-2014. The dataset includes both Chinese official development assistance (ODA) and other official flows (OOF) from the Chinese government to other countries with developmental, commercial, or representational intent.

Chinese ODA represents “Chinese aid” in the strictest sense of the term, but Chinese official finance (ODA + Other Official Flows) is sometimes used as a broader definition of aid. AidData’s dataset allows users to disaggregate Chinese official finance into its constituent parts (i.e., ODA-like and OOF-like projects) to examine Chinese assistance using either a narrow or broad definition of aid.

Export credit projects are coded as other official financing (OOF) flows that have commercial intent. China Export-Import Bank and China Development Bank are the two major providers of export credits for buyers and suppliers.

In addition to ODA and OOF, the dataset also includes one additional category, Vague Official Finance, which we assign to any flow that represents official financing, but for which we have insufficient information (i.e., the level of concessionality included in the project) to determine whether the project should be classified as ODA-like or OOF-like. This category has been created by AidData to be fully transparent about the uncertainty and imprecision in our data collection efforts.

We do not systematically capture "unofficial" financing such as Joint Ventures, Foreign Direct Investment, military assistance, or corporate aid.

2. Which geographic areas and time periods does your China dataset cover?

AidData’s Global Chinese Official Finance Dataset, Version 1.0 tracks official financing to 140 countries and territories between 2000 and 2014. The dataset includes official finance investments in five major regions of the world: (1) Africa, (2) the Middle East, (3) Asia and the Pacific, (4) Latin America and the Caribbean, and (5) Central and Eastern Europe.

The data collection process covered all low- and middle-income countries and territories -- 125 of which yielded at least one project during the time period. The dataset also includes some flows to 15 high-income countries in the specified regions. In the final version of the dataset, 140 countries and territories were found to have received at least some funding from China.

While we included North Korea in our data collection efforts, we caution users that this data is likely to be under-reporting actual official financing activity, given the secrecy of these financial flows. Users are advised to exclude North Korea from any statistical analysis or the generation of estimates of total Chinese ODA or OOF to specific countries. More information on how to use our data is available here.

Significant time and effort is required to standardize and synthesize large volumes of structured and unstructured open-source information. Therefore, our dataset is currently reported with a time lag of 2-3 years. Depending on the availability of resources, we would like to expand this time series to 2017.

3. What information do you collect on Chinese official finance projects?

Our unit of analysis is generically referred to as a "project". Broadly defined, a "project" is a discrete transfer of goods, services or cash. Apart from discrete projects, the dataset also captures individual activities that are subsets of larger projects, cash payments, economic and technical agreements, and MOUs for technical and economic cooperation.

We have differentiated records that refer to "mega deals" or conglomerates of projects/flows by marking them as "umbrella" projects. This is to reduce the chance of double counting across separate records; we generally recommend that users exclude umbrella projects from any estimates that they generate of aggregate amounts of Chinese official financing.

We believe that if you want to know what providers of official finance are really doing, you have to "follow the money". That is to say, you must follow projects through their entire life cycle. By tracing the progress of projects over time and triangulating a wide variety of sources (all of which are posted on the individual project pages on china.aiddata.org), we categorize each record in our database as either a pledge, an official commitment, a project in implementation, a completed project, or a suspended/cancelled project.

A pledge is a "verbal, informal agreement" between the development partner and partner country. We do not include pledges in our reporting of aggregate financial amounts of Chinese official financing, because there is no concrete evidence that these pledges have progressed. We advise all users to likewise exclude pledges from their analysis. An official commitment is a firm obligation, expressed in writing and backed by the necessary funds, undertaken by an official donor to provide specified assistance to a recipient country or a multilateral organization.
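The two exclusions recommended above (umbrella projects and pledges) can be sketched as a simple filter applied before summing amounts. This is only an illustration in plain Python: the "status" values follow the options listed in this FAQ, but the "is_umbrella" and "amount_usd" field names and all of the records are hypothetical, so check the readme file for the actual field names before adapting it.

```python
# Hypothetical records: only "project_title" and "status" are field names taken
# from this FAQ; "is_umbrella" and "amount_usd" are assumed for illustration.
projects = [
    {"project_title": "Hospital construction", "status": "completion",
     "is_umbrella": False, "amount_usd": 20_000_000},
    {"project_title": "Framework credit line", "status": "pipeline: commitment",
     "is_umbrella": True, "amount_usd": 500_000_000},
    {"project_title": "Announced rail loan", "status": "pipeline: pledge",
     "is_umbrella": False, "amount_usd": 300_000_000},
    {"project_title": "Road rehabilitation", "status": "implementation",
     "is_umbrella": False, "amount_usd": 45_000_000},
]


def counts_toward_aggregate(record):
    """Return True when a record is neither a pledge nor an umbrella project."""
    is_pledge = record["status"].strip().lower() == "pipeline: pledge"
    return not is_pledge and not record["is_umbrella"]


def aggregate_amount(records):
    """Sum the amounts of all records that pass the recommended exclusions."""
    return sum(r["amount_usd"] for r in records if counts_toward_aggregate(r))


total = aggregate_amount(projects)
print(total)
```

Here only the hospital and road records contribute to the total; the umbrella record and the pledge are skipped, which avoids double counting and unverified amounts.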

We also include project cancellation and suspension data. A central question for many analysts of aid allocation focuses on the conditions under which either donors or recipients choose not to follow through on their commitments. In order to identify these conditions, one would ideally have data on the cases where the donor "changed its mind" or chose to suspend/cancel a project. Additionally, suspended or canceled projects can be used as evidence to understand changing relations between the Chinese government and the recipient.

In total, we track 63 variables for each project record in the dataset. Complete definitions for each dataset field are included in the readme file, and more detailed information is available in the methodology document. The most important variables include:

project_title - title of the project, as created by our researchers.

year - year in which the project was agreed upon between the donor and recipient (i.e. year committed). In the case of pledges, this is the year the project was pledged (e.g. announced).

recipients_all - the partner country(ies) receiving the financing or technical assistance.

status - the current state of the project. Options include: pipeline: pledge; pipeline: commitment; implementation; completion; suspended; and cancelled.

intent - the donor’s intent for the project. Options cover the following categories: Development, Commercial, Representational, and Mixed (some development, no development, uncertain).

We do not track disbursements, because this would require information on yearly aid disbursements that is not usually provided in open source documents. However, by using the "status" variable, users may classify projects that have been implemented or completed as disbursements of Chinese official financing; during the quality assurance phase, our team strives to update the status variable for each project record to accurately reflect the most up-to-date status of the project.
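As a rough sketch of the workaround described above, the "status" variable can be used to flag records that have reached implementation or completion. The status strings follow the options listed in this FAQ; the sample records and the helper name are hypothetical, and the result is a proxy for disbursement, not actual disbursement data.

```python
# Statuses treated as a proxy for disbursement, per the suggestion above.
DISBURSEMENT_PROXY_STATUSES = {"implementation", "completion"}


def is_disbursement_proxy(status):
    """Return True when a status suggests funds have likely been disbursed."""
    return status.strip().lower() in DISBURSEMENT_PROXY_STATUSES


# Hypothetical records for illustration only.
records = [
    {"project_title": "Water supply upgrade", "status": "completion"},
    {"project_title": "Port expansion", "status": "pipeline: commitment"},
    {"project_title": "Stadium construction", "status": "implementation"},
    {"project_title": "Power plant loan", "status": "cancelled"},
]

likely_disbursed = [
    r["project_title"] for r in records if is_disbursement_proxy(r["status"])
]
print(likely_disbursed)
```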

4. How do you collect your official finance data? How do you ensure the data is reliable?

AidData uses an innovative, open source data collection methodology called Tracking Underreported Financial Flows (TUFF) to capture the known universe of Chinese official financing flows at the project level from 2000-2014.

The TUFF methodology is a transparent, systematic, and replicable set of procedures for standardizing and synthesizing information from four types of sources:

When conflicting information is reported by different sources, we prioritize the official sources or the information that most sources report. For the purposes of financial reporting, when all of our sources are media reports and a majority of the available sources do not agree upon a consistent number, we default to the lowest financial estimate to err on the side of caution.
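One way to read the reconciliation rule above is sketched below. It is not AidData's implementation. The function name, the source-type labels, and the order in which the rules are applied (official sources first, then majority agreement, then the lowest estimate) are assumptions made for illustration. In particular, the text only specifies the lowest-estimate fallback for cases where every source is a media report; applying the same fallback when official sources disagree with each other is an assumption of this sketch.

```python
from collections import Counter


def resolve_amount(reports):
    """Pick one amount from (source_type, amount) pairs.

    Assumed order of rules (a sketch, not the documented procedure):
    1. If any official source reports an amount, consider only official sources.
    2. If a strict majority of the considered sources agree, use that amount.
    3. Otherwise fall back to the lowest reported amount.
    """
    if not reports:
        raise ValueError("at least one report is required")

    official = [amount for source_type, amount in reports
                if source_type == "official"]
    pool = official if official else [amount for _, amount in reports]

    amount, count = Counter(pool).most_common(1)[0]
    if count > len(pool) / 2:
        return amount
    return min(pool)


# Hypothetical reports for a single project, in USD millions.
majority_case = resolve_amount([("media", 100), ("media", 100), ("media", 120)])
no_majority_case = resolve_amount([("media", 100), ("media", 120), ("media", 150)])
official_case = resolve_amount([("official", 200), ("media", 100), ("media", 100)])
```

In the second case no amount is reported by a majority of the media sources, so the sketch returns the lowest estimate, mirroring the cautious default described above.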

The TUFF methodology has been stress-tested, refined, codified, and subjected to scientific peer-review, resulting in dozens of working papers and journal publications. The use of TUFF-derived data on Chinese development finance has also resulted in more than 90 stories in elite and mass media outlets, including articles in The Guardian, The Economist, and the Financial Times.

A recent publication in the Journal of Development Studies by Muchapondwa et al. (2016) also found that field-based and TUFF-based data collection methods produce generally very similar data. However, field-based data collection is prohibitively costly and complex if one is trying to achieve comprehensive, global coverage of China’s official financing activities. Nor is it sustainable over the long run.

We have made several improvements to our TUFF methodology with this iteration:

Reduced reliance on media sources: The latest version of the dataset relies on media reports for only 56% of all sources -- this is down from 89% in the original Chinese Official Finance to Africa dataset.

Increased use of official and academic sources: Official government data and documentation from China, counterpart countries, and international organizations now constitute 27.6% of all sources (up from 21% in the 1.0 version of the dataset). Peer-reviewed journal articles and other academic publications represent 6.8% of all sources (up from 1% in the 1.0 version of the dataset).

Expanded number of sources for each project: There has also been an increase in the average number of sources that underpin each project record -- from 2.13 sources (in the 1.0 dataset version) to 3.6 sources (in the latest version of the dataset).

More complete information records: The average project record completeness score has increased from 6.09 (in the 1.0 dataset version) to 6.37 (in the current version of the dataset). This means that an increasing number of the core fields (e.g. transaction amounts, flow types, and commitment years) for each project record are populated.

To assemble AidData’s Global Chinese Official Finance Dataset, Version 1.0, we have collected project-level information from over 15,000 distinct information sources. On average, each project entry is informed by 3.6 sources. Although 24% of project records are based on information from a single source, they represent only 6% of China’s total financial commitments globally. For each project, we include a "Health of Record" score that rates its completeness and verifiability.

All location information was collected from the source documentation used to create project records. Media reports, government documents, and academic articles can provide very detailed information on the sub-national location of projects. AidData’s geocoding methodology, which has also been adopted as a global reporting standard by the International Aid Transparency Initiative, has been applied to more than 1,900 Chinese development projects in our database. Our geocoding team triangulated the available information about each project and applied this methodology, which we also use to code projects from the World Bank, the African Development Bank, the Asian Development Bank, and other development finance institutions around the world.

5. Is the data available in the China dashboard the same as what is included in the static data?

The dashboard is primarily a tool for visualization and user feedback, not analysis. The visualization features give a user the opportunity to explore possible trends related to the nature, impact and distribution of Chinese aid and development finance projects. However, it does not substitute for the thorough spatial econometric analysis needed to derive substantive conclusions (for such an analysis, please see the AidData Working Paper, Aid on Demand: African Leaders and the Geography of China's Foreign Assistance).

Generally, the spatial base layers provided are an aggregate across our time-series. If a user is looking at a single year of Chinese development finance data, there could be a mismatch between what the base layers are displaying and the year. This could lead users to draw spurious correlations. In addition, when the map renders national-level projects (those with precision code 6 or 8), it places them at the center of the country. It is important to recognize that these projects are not actually located at that set of latitude and longitude coordinates.
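For users who work with the geocoded records directly, one cautious approach is to set aside locations with precision code 6 or 8 before any sub-national analysis, since their coordinates are only a placeholder at the center of the country. The sketch below assumes a hypothetical "precision_code" field and made-up records; the actual field name may differ in the dataset.

```python
# Precision codes that, per the text above, are rendered at the center of the
# country rather than at a real project site.
COUNTRY_LEVEL_PRECISION_CODES = {6, 8}


def has_subnational_location(location):
    """Return True when the coordinates refer to a place more specific than the country."""
    return location["precision_code"] not in COUNTRY_LEVEL_PRECISION_CODES


# Hypothetical geocoded locations; field names and values are illustrative only.
locations = [
    {"project_title": "Bridge construction", "precision_code": 1},
    {"project_title": "Budget support grant", "precision_code": 8},
    {"project_title": "Provincial clinic program", "precision_code": 4},
    {"project_title": "National scholarship scheme", "precision_code": 6},
]

mappable = [
    loc["project_title"] for loc in locations if has_subnational_location(loc)
]
print(mappable)
```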

6. How should I cite your dataset and dashboard? Are there any licensing restrictions?