Find out what data is available in the IDI, including the data dictionaries for each collection.

The IDI contains de-identified data about people and households, which comes from a range of government agencies, Stats NZ surveys including the 2013 Census, and non-government organisations.

The IDI is updated (or ‘refreshed’) regularly to keep the data up to date. Most datasets are updated quarterly, but some are updated less frequently.

The diagram below shows an overview of the datasets currently available in the IDI. Click on the image to access a downloadable PDF.

The IDI contains de-identified data

The data held in the IDI is de-identified. De-identified data has had personal identifiers removed or encrypted (that is, replaced with another number) so that data records are not associated with named individuals. De-identification is one of the ways we keep the data safe – it supports analytical insights while maintaining privacy and confidentiality.
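The idea of replacing a personal identifier with another number can be sketched as follows. This is an illustration only, not Stats NZ's actual process: the function `deidentify`, the field names `national_id`, `name`, and `surrogate_id`, and the use of random surrogate numbers (rather than encryption) are all assumptions made for the example.

```python
import secrets


def deidentify(records, id_field="national_id", drop_fields=("name",)):
    """Return copies of records with the identifier replaced by a surrogate.

    Hypothetical sketch: field names and the random-surrogate approach are
    assumptions, not the method used in the IDI.
    """
    surrogate_for = {}  # original identifier -> surrogate number
    used = set()        # surrogates already handed out
    cleaned = []
    for record in records:
        original = record[id_field]
        if original not in surrogate_for:
            candidate = secrets.randbelow(10**9)
            while candidate in used:  # avoid giving two people the same number
                candidate = secrets.randbelow(10**9)
            used.add(candidate)
            surrogate_for[original] = candidate
        # Keep every field except the identifier and the naming fields.
        row = {k: v for k, v in record.items()
               if k != id_field and k not in drop_fields}
        row["surrogate_id"] = surrogate_for[original]
        cleaned.append(row)
    return cleaned
```

Records for the same person receive the same surrogate number, so they can still be analysed together, but the output no longer carries the name or the original identifier.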

Core data

These data tables are available to all IDI researchers with an approved research project.

Security concordance

What the data is about: An encrypted identifier for each identity in the IDI that is common across all datasets. Allows a researcher to link between the datasets they have access to, and shows where individuals do not link or are not present.
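As a rough illustration of how a concordance supports linking, the sketch below joins two datasets through a shared identifier and reports identities that appear in only one of them. The function `link_via_concordance` and the column names `health_id` and `tax_id` are hypothetical; only `snz_uid` is a name used elsewhere on this page, and the real table layout may differ.

```python
def link_via_concordance(concordance, left, left_key, right, right_key):
    """Link two datasets through a concordance table.

    Hypothetical sketch: each concordance entry is assumed to hold a common
    "snz_uid" plus one identifier per source dataset (None if absent).
    Returns (linked, unlinked), where unlinked lists the snz_uid values
    found in only one of the two datasets.
    """
    left_by_id = {row[left_key]: row for row in left}
    right_by_id = {row[right_key]: row for row in right}
    linked, unlinked = [], []
    for entry in concordance:
        left_row = left_by_id.get(entry.get(left_key))
        right_row = right_by_id.get(entry.get(right_key))
        if left_row is not None and right_row is not None:
            linked.append({"snz_uid": entry["snz_uid"],
                           "left": left_row, "right": right_row})
        elif left_row is not None or right_row is not None:
            unlinked.append(entry["snz_uid"])
    return linked, unlinked
```

Returning the unlinked identities separately mirrors the description above: the concordance shows not only where records link, but also where an individual is missing from a dataset.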

Personal details

What the data is about: Stats NZ’s best estimate of demographic information, such as sex and birth month/year, derived from multiple collections in the IDI using a set of specific rules. Ethnicity variables are an ‘ever-indicator’ that shows all ethnicities an identity has recorded across data collections over time.
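An ‘ever-indicator’ can be thought of as the union of every ethnicity recorded for an identity across collections and over time. The sketch below shows that idea on made-up records; the function `ever_ethnicity` and the input layout (pairs of `snz_uid` and ethnicity) are assumptions for illustration, not the derivation rules Stats NZ applies.

```python
from collections import defaultdict


def ever_ethnicity(records):
    """Collect every ethnicity ever recorded for each identity.

    Hypothetical sketch: records is an iterable of (snz_uid, ethnicity)
    pairs drawn from any number of collections and time periods.
    """
    ever = defaultdict(set)
    for snz_uid, ethnicity in records:
        ever[snz_uid].add(ethnicity)  # a set ignores repeated recordings
    return {uid: sorted(groups) for uid, groups in ever.items()}
```

Because the result is a union, an ethnicity recorded once in any collection stays on the indicator even if later records omit it.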

Source ranked ethnicity

What the data is about: Ranking of ethnicity information from all data sources. The ranking is based on an investigation by Stats NZ’s Statistical Methods team, which looked at the potential for administrative data to provide ethnicity information comparable to the current census and the official series of ethnic population estimates.

Estimated resident population (ERP)

What the data is about: Identifies individuals who are resident within New Zealand on a given date. The table relies on individuals having activity within administrative data sources, and consists of snapshots of the population at a specific point in time.

Restricted data

We grant access to the following data on a case-by-case basis where a research project meets access criteria. This ensures approved researchers only get access to the IDI data essential for their project.

Benefits and social services data

ACC injury claims data

Source: Accident Compensation Corporation
Time: From 1994
What the data is about: All claims made due to work-related or non-work-related injury, whether the injury occurred in New Zealand or not.
See IDI Data Dictionary: ACC injury claims data for more information.
Application code: ACC

Benefit dynamics data

Source: Ministry of Social Development
Time: From 1990
What the data is about: People who have received a working-age social welfare benefit. Provides basic information on demographic characteristics, and traces their changing benefit status and other circumstances. Also traces the benefit histories of partners and dependent children included in benefits.
Note: data dictionary available on the IDI Wiki in the Data Lab.
Application code: MSD

Child, Youth and Family data

Source: Ministry of Social Development
Time: From 1991
What the data is about: A child or young person (CYP). Information ranges from instances of neglect or abuse, concerns about behaviour or care, and offences committed by a CYP.
See IDI Data Dictionary: Child, Youth and Family data for more information.
Application code: CYF

Children's Action Plan

Source: Ministry of Social Development
Time: From 2013
What the data is about: This dataset has demographic and referral information of vulnerable children under CAP's care.
Application code: CAP

Family Start

Source: Oranga Tamariki - Ministry for Children
Time: From 2008
What the data is about: Family Start is an intensive early intervention home visitation programme for families with young children aged 0 to 12 months at the time of referral, who are at risk of poor life outcomes. The programme targets the highest-need families and provides practical advice, support and parenting education to ensure that their children have the best possible start in life.
Application code: OTFS

Student loans and allowances data from Inland Revenue

Source: Inland Revenue
Time: From 1992
What the data is about: Consolidated data extract (CDE) made of a series of student-loan-specific variables and unit-record monthly data (URMD), which is used for several purposes including contributing to the valuation of the student loans scheme.
See IDI Data Dictionary: Student loans and allowances data from Inland Revenue for more information.
Application code: SLA

Student loans and allowances data from StudyLink

Source: Ministry of Social Development
Time: From 1992
What the data is about: Data provided by applicants, their education provider(s), and related parties, that is used by StudyLink to assess eligibility and entitlement to student allowances, student loans, and scholarships.
See IDI Data Dictionary: Student loans and allowances data from StudyLink for more information.
Application code: SLA

Working for Families (WFF) research dataset

Source: Inland Revenue
Time: From 2003
What the data is about: A package designed to help make it easier to work and raise a family. It consists of WFF tax credits, Accommodation Supplement, and Childcare Assistance.
See IDI Data Dictionary: Working for Families research data for more information.
Application code: WFF

Youth services data

Source: Ministry of Social Development
Time: From 2004
What the data is about: Young people who are not in education, employment, or training (NEET) or are at risk of becoming NEET. The service aims to reduce the likelihood these youth will become welfare dependent, and to contribute to better life outcomes for them.
Note: data dictionary available on the IDI Wiki in the Data Lab.
See IDI Data Dictionary: MSD Youth Service Interventions for more information.
Application code: YST

Education and training data

Industry training education data

Source: Ministry of Education
Time: From 2001
What the data is about: Individuals in programmes administered by an Industry Training Organisation in each training fund in a calendar year.
See IDI Data Dictionary: Industry training education data for more information.
Application code: MOE

Primary and secondary schools data

Source: Ministry of Education
Time: From 2007
What the data is about: Students enrolled at New Zealand schools, including student interventions, student NCEA qualifications attained, student qualifications and standards attained, university entrance attained and student Expected Percentile Score.
See IDI Data Dictionary: Primary and secondary schools data for more information.
Application code: MOE

Programme for International Assessment of Adult Competencies (PIAAC)

Source: Ministry of Education
Time: 2014
What the data is about: The PIAAC survey provides insights into general adult competencies and looks at areas like numeracy, literacy, education, skills, work profile, and personal details.
Note: codebook available on the IDI Wiki in the Data Lab.
Application code: PIAAC

Tertiary education data

Source: Ministry of Education
Time: From 1994
What the data is about: Students who are enrolled in, or have completed, formal or non-formal tertiary qualifications at government-funded tertiary education organisations.
See IDI Data Dictionary: Tertiary education data for more information.
Application code: MOE

Geographic data

Address notification

Source: Multiple sources
What the data is about: Prioritised address history for all snz_uid where address information exists. Uses a simple set of business rules to limit the full address table to a best-guess list of residential addresses. Where possible, Statistics NZ has provided a corresponding meshblock, territorial authority, and regional council code for each snz_uid.
Application code: GEOGRAPHIC

Address notification – full

Source: Multiple sources
What the data is about: Address information from all sources, providing a full list of every geocoded address and collating all address change notifications. Researchers can apply their own rules to filter the table to best suit their needs. Where possible, Statistics NZ has provided a corresponding meshblock, territorial authority, and regional council code for each snz_uid.
Application code: GEOGRAPHIC
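As one example of a researcher-defined rule, the sketch below picks, for each snz_uid, the most recent address notification on or before a reference date. It is only an illustration: the function `address_as_at` and the field names `notification_date` and `meshblock` are assumed for the example and are not the table's documented columns, and this rule is not the business rule Stats NZ uses for the prioritised table.

```python
from datetime import date


def address_as_at(notifications, as_at):
    """Pick each person's latest notified meshblock on or before as_at.

    Hypothetical sketch: each notification is a dict with assumed keys
    "snz_uid", "notification_date" (a datetime.date) and "meshblock".
    """
    latest = {}
    for row in notifications:
        if row["notification_date"] > as_at:
            continue  # ignore notifications after the reference date
        current = latest.get(row["snz_uid"])
        if current is None or row["notification_date"] > current["notification_date"]:
            latest[row["snz_uid"]] = row
    return {uid: row["meshblock"] for uid, row in latest.items()}
```

A different project might instead rank notifications by source or require a minimum length of stay; the full table leaves that choice to the researcher.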

Person overseas spell

Source: Multiple sources
What the data is about: A summary of all border movements by individuals in MBIE’s immigration data.
Application code: GEOGRAPHIC

Note: geocoding allows us to make location information available to researchers as small geographic units called meshblocks, which keep specific addresses anonymous.

Health data

B4 School Check

Source: Ministry of Health
Time: From 2011
What the data is about: Results of the B4 School Check for each individual who undergoes the checks, and information about those who have declined. The B4 School Check aims to promote health and well-being in four-year-olds, and identify and address concerns that could affect a child’s ability to get the most benefit from school.
See IDI Data Dictionary: B4 School check data for more information.
Application code: MOH_B4SC

Cancer registrations

Source: Ministry of Health
Time: From 1995
What the data is about: A subset of fields from the NZ Cancer Registry. Specifically, information on malignant cancer registrations for healthcare users in the population cohort.
See IDI Data Dictionary: Cancer registrations data for more information.
Application code: MOH_CANCER_REG

Health Tracker

Source: Ministry of Health
Time: 2006–13
What the data is about: Data from Health Tracker, a health-focused census of people living in New Zealand, using data from national health data collections held by the Ministry of Health.
Note: Located in the IDI sandpit database. The data dictionary is available on the IDI Wiki in the Data Lab.
Application code: MOH_HEALTH_TRACKER

Laboratory claims data

Source: Ministry of Health
Time: From 2003
What the data is about: A subset of fields from the Laboratory Claims Collection. Specifically, contains claim and payment information for laboratory tests processed by the General Transaction Processing System (GTPS). Also contains laboratory test information from Pegasus IPA and Medlab South.
Note: Data dictionary available on the IDI Wiki in the Data Lab.
Application code: MOH_LAB_CLAIMS

Maternity data

Source: Ministry of Health
Time: From 2003
What the data is about: These tables contain information relating to selected publicly funded maternity and newborn services, up to nine months before and three months after a birth.
Note: Data dictionary available on the IDI Wiki in the Data Lab.
Application code: MOH_MATERNITY

Mortality data

Source: Ministry of Health
Time: From 1988
What the data is about: Data classifying the underlying cause of death for all deaths registered in New Zealand, including all registered fetal deaths (stillbirths), using the World Health Organization Rules and Guidelines for Mortality Coding.
Note: Data dictionary available on the IDI Wiki in the Data Lab.
Application code: MOH_MORTALITY

National Immunisation Register (NIR)

Source: Ministry of Health
Time: From 2006
What the data is about: Data derived from unit record immunisation event information. A subset of all data available within the NIR. The NIR collection provides data for monitoring immunisation coverage and the progress of immunisation campaigns such as meningococcal B and HPV.
Note: Data dictionary is available on the IDI Wiki in the Data Lab.
Application code: MOH_NIR

National Needs Assessment and Service Coordination Information (SOCRATES)

Source: Ministry of Health
What the data is about: The National Needs Assessment and Service Coordination Information (SOCRATES) is used by Ministry-funded Needs Assessment and Service Coordination (NASC) agencies to record information about clients who are eligible for Disability Support Services (DSS).
Application code: MOH_SOCRATES

Pharmaceutical data

Source: Ministry of Health
Time: From 2005
What the data is about: Subset of fields from the Pharmaceutical Claims Collection. Specifically, contains information about subsidised dispensings processed by the General Transaction Processing System (GTPS), including demographic information about healthcare users to whom these prescriptions were dispensed.
See IDI Data Dictionary: Pharmaceutical data for more information.
Application code: MOH_PHARMACEUTICAL

PHO enrolment data

Source: Ministry of Health
Time: From 2003
What the data is about: Subset of fields from the PHO Enrolment collection. Specifically, contains information about the demographics of the population cohort enrolled in a PHO, along with when they were enrolled, when they were last seen, and the practice type.
Application code: MOH_PHO_ENROLMENT

Population cohort addresses

Source: Ministry of Health
Time: 2005 to 2013
What the data is about: Subset of fields from the NHI. Specifically, contains all address information (including domicile code) for each healthcare user in the population cohort.
Note: data dictionary is available on the IDI Wiki in the Data Lab.
Application code: MOH_ADDRESS

Programme for the Integration of Mental Health Data (PRIMHD)

Source: Ministry of Health
Time: From 2008
What the data is about: Subset of fields from PRIMHD. Specifically, contains data about the referral, what services (activities) were provided, and demographic information. Excludes outcomes, diagnosis, and legal status data.
See IDI Data Dictionary: Programme for the integration of mental health data for more information.
Application code: MOH_PRIMHD

Housing data

Social housing data

Source: Housing New Zealand Corporation
Time: From 1980
What the data is about: Individuals waiting for social housing, those in social housing, details about the tenancy, and the property details of government housing.
Note: data dictionary available on the IDI Wiki in the Data Lab.
Application code: HNZ

Income and work data

Business Register

Source: Multiple sources
Time: From 1999
What the data is about: Enterprise-level data, Permanent Business Number (PBN)-level data, and the relationship between business IRD numbers and enterprises, from the Statistics NZ Business Register.
Note: Data dictionary available on the IDI Wiki in the Data Lab.
Application code: BR (Note: access restricted to government researchers)

Business-centred data is usually accessed via Statistics NZ’s Longitudinal Business Database (LBD). The LBD is a linked longitudinal database that includes a range of business information. It can be used in conjunction with the person-centred data in the IDI. Access is restricted to government researchers. See Longitudinal Business Database.

New Zealand Income Survey

Source: Stats NZ
Time: From 2006
What the data is about: Sample data about sources of income, including wages and salaries, self-employment, government transfers, other private transfers, and investments.
See IDI Data Dictionary: New Zealand Income Survey for more information.
Application code: HLFS

Survey of Family Income and Employment (SoFIE)

Source: Stats NZ
Time: 2002–10
What the data is about: Sample data about respondents’ work, family, household circumstances, income and net worth, and studies, and how these change over time.
See Survey of Family, Income, and Employment (SoFIE) for more information.
Application code: SOFIE

Tax data (for non-government researchers)

Source: Inland Revenue
Time: From 1999
What the data is about: A subset of the employee monthly schedule (EMS) table, containing tax information relating to individuals. No access to business data.
See IDI Data Dictionary: IR tax data for more information.
Application code: IR_RESTRICT

Tax derived tables

Source: Multiple sources
Time: From 1999
What the data is about: In March 2014, Statistics NZ introduced six derived tables after ceasing the use of Linked Employer-Employee Dataset (LEED) research tables. Release of the tables followed a period of consultation with various agencies, where they were asked which LEED tables they used and how these were used.
Note: Data dictionary available on the IDI Wiki in the Data Lab.
Application code: IR (Note: access restricted to government researchers)

Recorded crime offenders data

Source: New Zealand Police
Time: From 2009
What the data is about: Alleged offenders recorded by NZ Police as prescribed in the NZ Police National Recording Standard, who have been proceeded against by police.
See IDI Data Dictionary: Recorded crime offenders data for more information.
Application code: POL

Recorded crime victims data

Source: New Zealand Police
Time: From 2014
What the data is about: Victims of crime recorded by NZ Police as prescribed in the NZ Police National Recording Standard, where the matter came to Police attention.
See IDI Data Dictionary: Recorded crime victims data for more information.
Application code: POL

Sentencing and remand data

Source: Department of Corrections
Time: From 1998
What the data is about: Data about Corrections’ management of convicted adult offenders who have received a community sentence or imprisonment, and people remanded until their trial is completed.
See IDI Data Dictionary: Sentencing and remand data for more information.
Application code: COR

People and communities data

Auckland City Mission data

Source: Auckland City Mission
Time: From 1996
What the data is about: Income, expenses, housing status, and household composition of Auckland City Mission clients, and the services these clients use. Auckland City Mission is a social service provider in Auckland CBD that helps Aucklanders in need by providing effective integrated services and advocacy.
Note: data dictionary available on the IDI Wiki in the Data Lab.
Application code: ACM

Disability Survey

Source: Stats NZ
Time: 2013
What the data is about: Data about the number of disabled children and adults living in New Zealand, including the nature and cause of impairments, the type of support needed, and how well disabled people are faring compared with non-disabled people.
See Disabilities for more information.
Application code: DS

Driver licence and motor vehicle registers

Source: New Zealand Transport Agency
What the data is about: Driver licence information related to people who hold, or have held, a New Zealand driver licence; and motor vehicle related information for people with a vehicle currently registered in their name.
See IDI Data Dictionary: Transport data for more information.
Application code: NZTA

General Social Survey (GSS)

Source: Stats NZ
Time: From 2008
What the data is about: Data about the well-being of New Zealanders aged 15 years and over, including a wide range of social and economic outcomes and how people are faring.
See New Zealand General Social Survey - information releases for more information.
Application code: GSS

Longitudinal Immigration Survey of NZ

Source: Stats NZ
Time: 2005–09
What the data is about: Data from the longitudinal immigration survey of NZ, which is designed to trace the pathways of migrants and produce a detailed, ongoing information base of their experiences and settlement outcomes.
See Longitudinal Immigration Survey of NZ for more information.
Note: data dictionary available on the IDI Wiki in the Data Lab.
Application code: LISNZ

Migrant Survey data

Source: Ministry of Business, Innovation and Employment
Time: From 2012
What the data is about: Data from the 2012 Migrant Survey, which is part of the Immigration Survey Monitoring Programme (ISMP). ISMP compiles information about migrants’ settlement and labour market outcomes, employers’ experience with migrants, and community attitudes towards immigration.
See IDI Data Dictionary: Migrant Survey data for more information.
Application code: MS

Te Kupenga

Source: Stats NZ
Time: 2013
What the data is about: Data on the social, cultural, and economic wellbeing of Māori in New Zealand, including information about the health of the Māori language and culture.
See Te Kupenga for more information.
Application code: TK

Population data

2013 Census

Source: Stats NZ
Time: 5 March 2013
What the data is about: People and dwellings in New Zealand on census night. Provides a snapshot of our society at a point in time.
See IDI Data Dictionary: 2013 Census data for more information.
Application code: CENSUS

Immigration data

Source: Ministry of Business, Innovation and Employment
Time: From 1997
What the data is about: Immigration administrative data on migrants and international visitors applying for a visa to enter New Zealand. Includes all resident visa applications and those applying for a temporary stay (work, study, and visitor).
See IDI Data Dictionary: Immigration data for more information.
Application code: DOL

International travel and migration data

Source: Stats NZ
Time: From 1997
What the data is about: International passenger arrivals to, and departures from, New Zealand. The New Zealand Customs Service has supplied Statistics NZ with electronic passport and flight records, and these are combined with information from arrival and departure cards.
See IDI Data Dictionary: International travel and migration data for more information.
Application code: CUS

Life event data

Source: Department of Internal Affairs
Time: From 1840
What the data is about: Life events relating to births, deaths, marriages, and civil unions registered in New Zealand.
See IDI Data Dictionary: Life event data for more information.
Application code: DIA