The Australian Bureau of Statistics (ABS) has embarked on a project to create a Statistical Longitudinal Census Dataset (SLCD) by linking records from the 2006 and subsequent Population Censuses. The SLCD will be based upon a 5% sample of the population. Since names and addresses will not be retained, probabilistic matching will be used to link records.
The ABS has investigated methodologies and applications for linking data sets over time. Probabilistic matching was applied to ABS data to evaluate these methodologies and to identify issues particular to Australian census data. The results in this paper do not use date of birth or Meshblock data, because neither was collected in the 2001 Census. Both items will improve the linkage, provided they are reported well.
This paper outlines previous work in this area, results from similar projects, the methodology proposed for linkage, and the preliminary results achieved so far.