Sample calibration

In industrialized countries, response rates in household surveys are often low and have been declining steadily for at least two decades. Sample calibration techniques are therefore implemented, taking advantage of ancillary information to adjust the sample weights.

In developing countries, response rates are typically high (usually above 90 percent). Complex adjustments of sample weights for non-response are thus not seen as critical and are rarely implemented. But there are signs that the situation is changing in some of these countries, which may affect the reliability of estimates of socio-economic indicators obtained from sample surveys.

The Sample Calibration project (jointly funded by the IHSN Trust Fund and the Knowledge for Change Program of the World Bank) aimed to assess the relevance of sample calibration techniques in the context of middle-income countries. Three countries where “anomalies” were detected in the age distribution of the population in household surveys were selected for pilot work. Sample calibration techniques were applied to the corresponding datasets.

The objectives of the project were to (i) assess if and how sample calibration techniques improve the quality of socio-economic indicators (in particular poverty and inequality measures), (ii) produce guidelines and training materials for the implementation of the techniques, and (iii) pilot a training course on sample calibration for official statisticians.

The project started with a sample calibration exercise and the preparation of related guidelines and training materials. This was followed by a regional hands-on training course. The workshop, organized by the World Bank, was hosted by the Asian Development Bank in Manila and attended by official statisticians from the Philippines, Thailand and Vietnam.

All materials of the project, with the exception of the country survey datasets, are made available below.

Project status:

Closed: The project was implemented between June 2015 and August 2016.

Sponsor(s):

World Bank Knowledge for Change Program (KCP), Grant No TF0A1077 and UK Department for International Development (DFID, Trust Fund No TF011722 administered by the World Bank Development Data Group)

Implemented by:

The project was implemented by the World Bank Development Data Group, with the support of an expert from the Italian Statistics Office. The Asian Development Bank also contributed by hosting the training workshop in Manila (May 2016).

Type of output:

Feasibility study report, and training materials

Tools

The project made use of ReGenesees, an open source R package developed by the Italian Statistics Office.

Output

Feasibility study report

Modern large-scale household surveys are expected to provide high quality estimates of population parameters that are compatible across data sources. However, signals of possible bias have occasionally been detected for some South-East Asian countries. For instance, the World Bank’s Household Survey Unit found significant discrepancies between survey-based estimates of age-sex and household size distributions and the corresponding Census counts for Vietnam, Thailand and the Philippines. A feasibility study was carried out to investigate whether this issue could be solved through a preliminary calibration procedure. Three Proofs of Concept were carried out using the ReGenesees open source package. The study showed that it was technically feasible to integrate a calibration procedure in the production workflow of all the household surveys considered.
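To illustrate the general idea of calibrating survey weights to Census counts, the following is a minimal sketch of raking (iterative proportional fitting), one of the simplest calibration techniques. It is not the procedure or the software used in the feasibility study, which relied on ReGenesees in R. The function name rake_weights, the respondent data, and the population totals are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rake_weights(weights, groups, targets, max_iter=200, tol=1e-10):
    """Adjust weights so that weighted totals match known margin totals.

    weights : initial design weights, one per respondent
    groups  : dict mapping a margin name to the category label of each respondent
    targets : dict mapping a margin name to {category: known population total}

    Assumes every category in `targets` has at least one respondent
    and that the margin totals are mutually consistent.
    """
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(max_iter):
        max_gap = 0.0
        for name, labels in groups.items():
            labels = np.asarray(labels)
            for category, total in targets[name].items():
                mask = labels == category
                current = w[mask].sum()
                # Track the largest relative gap seen before adjusting.
                max_gap = max(max_gap, abs(current - total) / total)
                # Scale the weights in this category to hit its known total.
                w[mask] *= total / current
        if max_gap < tol:
            break
    return w


# Hypothetical sample of 8 respondents, each with a design weight of 100.
design_weights = [100.0] * 8
groups = {
    "age": ["young", "young", "young", "old", "old", "old", "old", "old"],
    "sex": ["f", "m", "m", "f", "f", "m", "f", "m"],
}
# Hypothetical Census totals for each margin (both sum to 800).
targets = {
    "age": {"young": 400.0, "old": 400.0},
    "sex": {"f": 420.0, "m": 380.0},
}

calibrated = rake_weights(design_weights, groups, targets)
for name, labels in groups.items():
    for category in targets[name]:
        total = calibrated[np.asarray(labels) == category].sum()
        print(name, category, round(total, 3))
```

In this toy sample, the uncalibrated weights under-represent the young age group (a weighted total of 300 against a known total of 400). Raking rescales the weights until the weighted age and sex totals both agree with the known margins. Production tools typically add features such as bounded weight adjustments and variance estimation that this sketch omits.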

Software

ReGenesees (R Evolved Generalized Software for Sampling Estimates and Errors in Surveys) is a full-fledged, open source R software system developed and disseminated by the Italian Statistics Office. ReGenesees is a tool for design-based and model-assisted analysis of complex sample surveys.
The package (and its graphical user interface package, ReGenesees.GUI) runs under Windows, Mac OS, Linux and most Unix-like operating systems.

Training materials

This package (zip file) contains the course program, slides, and selected demos (R scripts) used for the 4-day training course on sample calibration organized by the World Bank and Asian Development Bank in Manila, May 2016. In addition to this package, a zip file containing exercises and solutions (R scripts) is made available.

This package (zip file) contains the exercises and R scripts (with related CSV datasets) used for the 4-day training course on sample calibration organized by the World Bank and Asian Development Bank in Manila, May 2016. During the course, statisticians from the three participating countries also worked on their own survey datasets. These datasets are the property of the respective national statistical agencies and are not publicly available.
Another package, also available from this website, provides the course program, slides, and selected demos (R scripts).
