The Human Fertility Database

Overview

The main goal of the Human Fertility Database (HFD) is to provide a broad audience of users with access to detailed, high-quality data on cohort and period fertility. We aim to develop the HFD into an important resource for monitoring, analyzing, comparing, and forecasting fertility, as well as for studying the causes and consequences of fertility change in the industrialized world. The uniform format of HFD data will facilitate comparative analysis across countries and regions and encourage researchers to move beyond simple indicators such as the period total fertility rate.

The HFD provides age-, cohort- and (whenever possible) birth-order-specific fertility rates, crude, cumulative and total fertility rates, tempo-adjusted total fertility rates, mean ages at birth, standard deviation in mean ages at birth, parity progression ratios, and also cohort and period fertility tables for national populations or areas. In addition, the HFD provides input data from which these measures and fertility tables are computed. The input data consist of detailed birth counts and estimates of female population exposure obtained from officially recognized sources.

The following features should make the HFD particularly attractive to its users:

High level of detail, which will provide the possibility to address different data needs and research questions;

Uniformity of methods and data design, which implies comparability of all data across countries, cohorts, and periods;

The emphasis on displaying order-specific fertility indicators, which should encourage a higher level of sophistication in fertility analyses and forecasts and further innovation in methodological research on fertility;

Free access to all data upon registration.

Scope and basic principles

For each population, one and the same set of methods is applied for the production of uniform output data. This facilitates comparability of the HFD data and indicators across countries and time.

The HFD is limited to populations where the registration of births by official statistical agencies is virtually complete and where population estimates over the range of reproductive ages are reliable. Methods employed in the HFD for obtaining output data do not include any treatment or adjustment of the input data for completeness and coverage.

A companion database, the Human Fertility Collection (HFC), will include fertility rates and indicators constructed by other researchers, research organizations and statistical agencies using various methods and data formats. The HFC data will therefore be less consistent and comparable across countries and time than the HFD data. At the same time, the HFC will be more flexible and will contain fertility data for countries and years that cannot be included in the HFD, including estimates of order-specific fertility for countries where birth order registration is restricted to marital birth order only. It is planned that the HFC will also feature detailed data sets, historical data, and relevant documents.

Like the Human Mortality Database (HMD), we follow, as far as possible, four guiding principles: comparability, flexibility, accessibility, and reproducibility.

We provide complete documentation of all data available through this site as well as full descriptions of methods applied and specific features of country data sets. A complete description of the HFD methodology is given in the Methods Protocol. For each country, the description of data sources is given in the References document posted on the respective country page. General country-specific information (completeness, coverage, data quality issues, definitions, etc.) can be found in Background and Documentation files within each country section.

The HFD provides free access to the data. Before gaining full access to the database, you must become a registered user, which requires accepting our user agreement.

If you have comments or questions, or trouble accessing the database, please contact us.

Computation of fertility rates and fertility tables

The HFD process for computing output fertility indicators from input data on births and population can be briefly described as a sequence of steps, which are specified below. In the process, the following data, indicators and outputs are produced:

Unconditional cohort and period age- and (whenever possible) order-specific fertility rates from birth counts and population exposures;

Summary indicators of cohort and period fertility from the unconditional age- and order-specific fertility rates;

Data on the distribution of women by age and parity from censuses or population registers;

Conditional age- and parity-specific fertility rates and period fertility tables by age and parity.

Detailed descriptions of the HFD methodology are given in the Methods Protocol. The following items provide a concise overview of data processing and methodology.

Births. We collect detailed annual data on live births over the longest possible time periods. Ideally, birth count data are classified by single years of age and year of birth (cohort) of the mother and by birth order of the child (biological birth order). In many cases, however, input birth counts are less detailed. In many countries, information about the mother's birth cohort is not available. In some countries, the age of the mother is available only by five-year rather than one-year age groups for some periods; this is typically the case for data from before 1960. To achieve a uniform data format with respect to age and birth order, additional splits and adjustments are performed for the HFD. For many countries or time periods, birth data by birth order are not available. In such cases, order-specific fertility rates, order-specific mean ages at birth, and cohort and period fertility tables cannot be obtained and are not featured in the HFD.

Population denominators. In the HFD, data on the female population as well as on the total population (i.e., men and women together) are used. There are two types of data for the female population. The first type specifies population exposure by age, which is needed for the computation of unconditional fertility rates. For most countries, the female population exposure is estimated using data on population size and deaths from the HMD. For countries that are not included in the HMD, these types of data are collected together with the data on births. The second type of population data contain counts of women by age and parity, which are needed for the computation of conditional fertility rates, and which serve as the major input for period fertility tables. These data are usually available from population censuses or registers, and, in rare cases, from large-scale population surveys. Data on total population are used only for the computation of crude birth rates.

Fertility rates. Fertility rates are ratios of birth counts to the corresponding population exposures. Unconditional age-specific fertility rates relate births, classified by age of the mother or by age of the mother and birth order of the child, to all women of a given age. Conditional age- and order-specific fertility rates measure childbearing intensity among women of a specific age and parity (e.g., second births are related to women of parity one only). Cohort fertility rates are computed for every combination of cohort and age. Period fertility rates are computed for every combination of calendar year and age. Furthermore, the HFD includes population exposures and birth counts by Lexis triangles, making it possible for an advanced user to compute fertility rates and fertility tables in any configuration desired.
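
As an illustration of these definitions, the following Python sketch computes a period rate, a cohort rate, and an unconditional versus a conditional second-birth rate. All numbers, names and the dictionary layout are hypothetical and chosen only for this example; they are not the HFD file format. The sketch also assumes that the "lower" triangle of a given year and age holds women born in year minus age, that the "upper" triangle holds women born one year earlier, and that a cohort rate combines the two triangles of one cohort at one age.

```python
def rate(births, exposure):
    """Fertility rate = birth count / person-years of exposure."""
    return births / exposure if exposure > 0 else 0.0

# Hypothetical Lexis-triangle data keyed by (year, age, triangle).
# Assumed convention: "lower" = mothers born in (year - age),
#                     "upper" = mothers born in (year - age - 1).
births = {
    (2000, 25, "lower"): 40.0, (2000, 25, "upper"): 44.0,
    (2001, 25, "lower"): 42.0, (2001, 25, "upper"): 46.0,
}
exposure = {
    (2000, 25, "lower"): 500.0, (2000, 25, "upper"): 500.0,
    (2001, 25, "lower"): 520.0, (2001, 25, "upper"): 480.0,
}

def period_rate(year, age):
    """Both triangles of one calendar year and one age."""
    lo, up = (year, age, "lower"), (year, age, "upper")
    return rate(births[lo] + births[up], exposure[lo] + exposure[up])

def cohort_rate(cohort, age):
    """Both triangles of one birth cohort at one age (spans two calendar years)."""
    lo = (cohort + age, age, "lower")
    up = (cohort + age + 1, age, "upper")
    return rate(births[lo] + births[up], exposure[lo] + exposure[up])

period_2000 = period_rate(2000, 25)   # year 2000, age 25
cohort_1975 = cohort_rate(1975, 25)   # women born in 1975, age 25

# Unconditional vs. conditional second-birth rate at one age (toy numbers).
second_births = 30.0
exposure_all_women = 1000.0    # all women of this age
exposure_parity_one = 300.0    # only women who already have exactly one child
unconditional_rate = rate(second_births, exposure_all_women)
conditional_rate = rate(second_births, exposure_parity_one)
```

The same birth count yields a much higher conditional rate because the denominator is restricted to the women actually at risk of a second birth.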

Summary measures. The summary measures are crude birth rates, total fertility rates (including completed cohort fertility), tempo-adjusted TFR, mean ages at birth, cohort parity progression ratios, and standard deviation in mean ages at birth. The crude birth rate is the ratio of total live births to total population in a given year, expressed as the number of live births per 1,000 population. The other summary measures are based on fertility rates by age and (when possible) by birth order. The TFR equals the sum of age-specific fertility rates over the entire range of reproductive ages. We also provide the TFR by age 40, based on a summation of age-specific fertility rates over all ages under 40. These two quantities show the average number of children a woman from the population of interest would have by the end of her reproductive life (or by age 40) if she experienced at each age the age-specific fertility rates observed in a given year. Completed cohort fertility shows the average number of children (or children of a specific birth order) born to women belonging to a certain cohort over their whole reproductive lives. The cohort parity progression ratio expresses the probability of having a birth of order i+1, conditional on reaching parity i. The HFD also provides the period TFR adjusted for tempo effects using the Bongaarts and Feeney (1998) method. Cumulative fertility rates are based on a summation up to the indicated age limit and are shown for each single year of age. Mean age at birth and mean age at birth by age 40 are computed from the schedule of age-specific fertility rates. They show average ages at birth weighted by age-specific fertility rates over the entire range of reproductive ages or over reproductive ages under age 40, respectively. Standard deviation in mean age at birth is computed on the basis of age-specific fertility rates and mean age at birth; this measure shows the extent of variability around the mean age at birth, computed from the entire schedule of age-specific rates and over the range of ages under age 40. All the summary measures are computed for all birth orders combined and for specific birth orders.
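
The summary measures described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the HFD implementation: the schedule is a toy one with only a few ages, all function names are hypothetical, the midpoint x + 0.5 is assumed as the average age within each age interval, parity progression ratios are taken as ratios of successive order-specific completed cohort fertility rates, and the tempo adjustment uses the general Bongaarts-Feeney form TFR_i / (1 - r_i) with r_i approximated by a centered difference in the order-specific mean age at birth. The exact formulas used by the HFD are given in the Methods Protocol.

```python
import math

# Toy period schedule (hypothetical numbers): age -> age-specific fertility rate.
# A real schedule covers every single year of age from 12 to 55.
asfr = {20: 0.05, 25: 0.10, 30: 0.10, 35: 0.04, 40: 0.01}

def _subset(rates, below_age):
    """Keep all ages, or only ages strictly below `below_age`."""
    return {x: f for x, f in rates.items() if below_age is None or x < below_age}

def total_fertility(rates, below_age=None):
    """TFR: sum of age-specific rates (optionally only ages under `below_age`)."""
    return sum(_subset(rates, below_age).values())

def mean_age_at_birth(rates, below_age=None):
    """Rate-weighted mean age; the x + 0.5 midpoint is an assumption."""
    kept = _subset(rates, below_age)
    return sum((x + 0.5) * f for x, f in kept.items()) / sum(kept.values())

def sd_age_at_birth(rates, below_age=None):
    """Rate-weighted standard deviation of age around the mean age at birth."""
    kept = _subset(rates, below_age)
    mab = mean_age_at_birth(rates, below_age)
    var = sum((x + 0.5 - mab) ** 2 * f for x, f in kept.items()) / sum(kept.values())
    return math.sqrt(var)

def crude_birth_rate(live_births, total_population):
    """Live births per 1,000 population."""
    return 1000.0 * live_births / total_population

def parity_progression_ratios(ccf_by_order):
    """PPR from parity i to i+1 as CCF_(i+1) / CCF_i, with CCF_0 = 1 (all women)."""
    reached = [1.0] + list(ccf_by_order)
    return [reached[i + 1] / reached[i] for i in range(len(ccf_by_order))]

def tempo_adjusted_tfr(tfr_i, mab_prev_year, mab_next_year):
    """Order-specific adjusted TFR = TFR_i / (1 - r_i); r_i by centered difference.
    The all-order adjusted TFR would be the sum over birth orders."""
    r = (mab_next_year - mab_prev_year) / 2.0
    return tfr_i / (1.0 - r)

tfr = total_fertility(asfr)
tfr40 = total_fertility(asfr, below_age=40)
mab = mean_age_at_birth(asfr)
sd = sd_age_at_birth(asfr)
cbr = crude_birth_rate(12_000, 1_000_000)
pprs = parity_progression_ratios([0.85, 0.68, 0.17])   # orders 1, 2, 3
adj_tfr1 = tempo_adjusted_tfr(0.6, 26.0, 26.4)
```

In this toy example a rise of 0.2 years per year in the mean age at first birth lifts the first-order TFR from 0.6 to about 0.75, which shows how postponement depresses the unadjusted period measure.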

Cohort fertility tables by age and parity. These are increment-decrement life tables, which model the process of childbearing in female cohorts by age and parity. They describe a two-dimensional cohort progression toward older ages and higher parities: as they age, women of the cohort of interest move from parity zero to parity one, from parity one to parity two, and so on. For each cohort, the life table functions are computed from the schedule of age- and parity-specific fertility rates as the major input data. The distribution of births by age of mother and birth order in the table and the parity distribution of the table population of females correspond to the observed (real) fertility trajectories of the cohorts analyzed.

Period fertility tables by age and parity. Many functions in these tables are identical to those in cohort fertility tables and their construction is based on comparable formulas. The period fertility tables describe the fertility progression in a 'synthetic cohort' of women on the basis of conditional age- and parity-specific fertility rates observed during one calendar year. In other words, the tables give a period snapshot of the fertility of many female birth cohorts and do not correspond to the childbearing history of any real cohort. The key input in period tables is the age- and parity-specific distribution of the female population of reproductive age (exposure population, see "Population denominators" above). These distributions are obtained from cohort fertility tables, from "golden" censuses that provide the parity distribution in one base year, or directly from population censuses or registers. In the latter case, the fertility tables are census- or register-based. The main output of the period fertility table is the parity- and age-adjusted total fertility rate, PATFR, and its order-specific components.
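
To make the table logic concrete, here is a minimal Python sketch of an increment-decrement recursion that yields the PATFR and its order-specific components from conditional rates. It rests on several assumptions that are not taken from the HFD documentation: the function name and data layout are hypothetical, rates are converted to probabilities with q = 1 - exp(-m), a woman can have at most one birth per single year of age, the table starts with all women childless (radix 1), and births above the highest listed order are ignored. The Methods Protocol describes the formulas the HFD actually uses.

```python
import math

def fertility_table(cond_rates, n_orders):
    """Run a synthetic cohort through conditional rates.

    cond_rates[x][i] is the rate of births of order i + 1 among women
    of parity i at age x. Returns (table births by order, final parity
    distribution), both per woman.
    """
    women = [1.0] + [0.0] * n_orders     # share of women at parity 0..n_orders
    table_births = [0.0] * n_orders      # cumulated births of order 1..n_orders
    for x in sorted(cond_rates):
        # Transitions are computed from the distribution at the start of the
        # age interval, so each woman moves up by at most one parity per age.
        moves = []
        for i in range(n_orders):
            q = 1.0 - math.exp(-cond_rates[x][i])   # assumed rate-to-probability step
            moves.append(women[i] * q)
        for i, b in enumerate(moves):
            women[i] -= b
            women[i + 1] += b
            table_births[i] += b
    return table_births, women

# Toy conditional rates for two ages and two birth orders (hypothetical):
# log(2) corresponds to a transition probability of one half.
rates = {
    20: [math.log(2), 0.0],
    21: [0.0, math.log(2)],
}
by_order, parity_dist = fertility_table(rates, n_orders=2)
patfr = sum(by_order)   # PATFR = sum of its order-specific components
```

Here half of the women have a first birth at age 20 and half of those mothers have a second birth at age 21, so the order-specific components are 0.5 and 0.25 and the final parity distribution is one half childless, one quarter at parity one and one quarter at parity two.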

Data issues

The raw data are collected from official and other validated sources, especially national statistical offices, statistical and demographic yearbooks, special tabulations from national registry data and official statistics websites. We also collect relevant documents from data providers, as well as scientific literature explaining data collection routines, related regulations and practices, and other factors affecting the quality of data on births and the female population.

Special attention is paid to the following aspects of 'raw' data:

completeness and timeliness of birth registration;

coverage of the whole territory and all population groups by the birth registration;

actual definitions of live birth and availability of data on live births rather than on all births;

availability of information about biological birth order rather than information restricted to marital births or births within current marriage;

age categories of mothers for which birth data are reported;

reliability of census and registry reporting of the parity of women.

Towards consistency and uniformity of the data

The HFD aims at providing opportunities for comparative studies on fertility in different countries and/or time periods. In this regard, consistency across the whole data universe is an important priority. That is why we apply a uniform set of procedures to each population.

The desire for uniformity is hindered by the significant variability in the original data formats and the lack of sufficient detail in these 'raw' data. For example, the original birth data may be provided for one-year or five-year age groups, may not include the cohort dimension, may cover broader or narrower ranges of ages, may include births with unknown birth order, or may show total births instead of live births.

The HFD methodology includes procedures for the transformation of any set of raw data into data classified by single years of age ranging from age <=12 to 55+, by single-year birth cohorts, and (whenever possible) by birth orders varying from 1 to 5+. Births with unknown age of the mother are distributed proportionally across the range of known ages of the mother. Five-year age groups are additionally split into single-year ages by means of spline interpolation. For each age, births with unknown birth order are distributed proportionally across known birth orders. Birth orders higher than five are aggregated into birth order 5+. If needed, age-specific birth counts are extrapolated toward younger and older ages to cover the range of ages from 12 to 55 years. For each age, births are additionally split by year of birth of the mother (if such information is not present in input data).
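
Two of the steps above, the proportional redistribution of unknowns and the aggregation of higher birth orders into 5+, can be sketched as follows. The function names, the "unknown" key and the counts are hypothetical, and the spline-based splitting of five-year age groups and the extrapolation to ages 12 and 55 are not shown.

```python
def redistribute_unknown(counts, unknown_key="unknown"):
    """Spread the 'unknown' count over known categories in proportion to their size."""
    known = {k: float(v) for k, v in counts.items() if k != unknown_key}
    unknown = float(counts.get(unknown_key, 0.0))
    total_known = sum(known.values())
    if total_known == 0:
        return known   # nothing to scale against; unknowns cannot be allocated
    factor = 1.0 + unknown / total_known
    return {k: v * factor for k, v in known.items()}

def aggregate_orders(by_order, top=5):
    """Collapse integer birth orders >= top into an open-ended 'top+' category."""
    out = {str(k): 0.0 for k in range(1, top)}
    out[f"{top}+"] = 0.0
    for order, n in by_order.items():
        key = f"{top}+" if order >= top else str(order)
        out[key] += n
    return out

# Births in one year with unknown age of mother (toy numbers).
births_by_age = {24: 90, 25: 110, "unknown": 20}
by_age = redistribute_unknown(births_by_age)

# Births at one age with unknown birth order, then aggregation to 5+.
births_by_order = {1: 50, 2: 30, 3: 10, 5: 6, 6: 4, "unknown": 10}
by_order = aggregate_orders(redistribute_unknown(births_by_order))
```

Both steps preserve the total number of births: the 20 births with unknown age are absorbed into ages 24 and 25, and the 110 births by order remain 110 after redistribution and aggregation.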

At the same time, the data for each country are carefully checked and processed with attention to their specific features, which are outlined in the country documentation file; if needed, country experts are consulted. Data processing includes checks for specific data problems and correction of errors, consultations with local demographers and statisticians, investigation of country documents and literature, and comparisons with alternative fertility data and estimates. These procedures help to ensure a high standard of data quality for each HFD country.